perm filename V2ANS.TEX[TEX,DEK]2 blob
sn#363574 filedate 1978-06-22 generic text, type C, neo UTF8
\input acphdr % Answer pages (double-check position of figures)
\runninglefthead{ANSWERS TO EXERCISES}
\titlepage\setcpage0
\vfill
\tenpoint
\ctrline{ANSWER PAGES for THE ART OF COMPUTER PROGRAMMING}
\ctrline{(Volume 2)}
\ctrline{$\copyright$ 1978 Addison--Wesley Publishing Company, Inc.}
\vfill
\eject
\setcpage 1
%folio 649 galley 1 (C) Addison-Wesley 1978 *
\setcpage 600
\titlepage
\runninglefthead{ANSWERS TO EXERCISES}
\corners
\vskip 1in plus 30pt minus 10pt
\rjustline{\:;ANSWERS TO EXERCISES}
\vskip 1cm plus 30pt minus 10pt
\quoteformat{This branch of mathematics is the only one, I believe,\cr
in which good writers frequently get results entirely erroneous.\cr
. . . It may be doubted if there is a single\cr
extensive treatise on probabilities in existence\cr
which does not contain solutions absolutely indefensible.\cr}
\author{C. S. PEIRCE, in {\sl Popular Science Monthly} (1878)}
\vskip 1cm plus 30pt minus 10pt
\sectionskip\exbegin{NOTES ON THE EXERCISES}
\ansno 1. An average problem for a mathematically inclined reader.
\ansno 3. See W. J. LeVeque, {\sl Topics in Number Theory \bf 2}
(Reading, Mass.: Addison-Wesley, 1956), Chapter 3. ({\sl Note:} One
of the men who read a preliminary draft of the manuscript of
this book reported that he had discovered a truly remarkable
proof, which the margin of his copy was too small to contain.)
\ansbegin{3.1}
\ansno 1. (a)\9This will usually fail, since ``round''
telephone numbers are often selected by the telephone user when
possible. In some communities, telephone numbers are perhaps assigned
randomly. But it is a mistake in any case to try to get several
successive random numbers from the same page, since the same telephone
number is often listed several times in a row.

(b)\9But do you use the left-hand page or the right-hand
page? Say, use the left-hand page number, divide by 2, and take
the units digit. The total number of
pages should be a multiple of 20; but even so, this method will have some bias.
\eject % This gets part of Section 3.1 on the first page of the answers

(c)\9The markings on the faces will slightly bias
the die, but for practical purposes this method is quite satisfactory
(and it has been used by the author in the preparation of several
examples in this set of books). See {\sl Math.\ Comp.\ \bf 15} (1961),
94--95 for further discussion of these dice.

(d)\9(This is a hard question thrown in purposely
as a surprise.) The number is not random; if the average number
of emissions per minute is $m$, the probability that the counter
registers $k$ is $e↑{-m}m↑k/k!$ (the Poisson distribution). So
the digit 0 is selected with probability $e↑{-m} \sum ↓{k≥0}
m↑{10k}/(10k)!$, etc. The units digit will be even with probability
$e↑{-m}\mathop{\hjust{cosh}}m = {1\over2} + {1\over2}e↑{-2m}$, and
this is never equal to ${1\over2}$
(although the error is negligibly small when $m$ is large).

(e)\9Okay, provided that the time since the previous
digit selected in this way is random. However, there is possible
bias in borderline cases.

(f,g)\9No, people usually think of certain digits
(like 7) with higher probability.

(h)\9Okay; your assignment of numbers to the horses
had probability $1\over10$ of assigning a given digit to
the winning horse.
\ansno 2. The number of such sequences is the multinomial
coefficient $1000000!/(100000!)↑{10}$; the probability is this
number divided by $10↑{1000000}$, the total number of sequences of a million
digits. By Stirling's approximation the
probability is close to $1/(16π↑410↑{22}\sqrt{2π}) \approx 2.55
\times 10↑{-26}$, about one chance in $4 \times 10↑{25}$.
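This estimate is easy to confirm numerically; the sketch below (an illustration, not part of the original answer) evaluates the exact log-probability with log-factorials, so no intermediate quantity overflows.

```python
import math

# Probability that a million random digits contain exactly 100000 of each
# digit: (1000000! / (100000!)^10) / 10^1000000, via lgamma(n+1) = ln n!.
log_p = (math.lgamma(1000001) - 10 * math.lgamma(100001)
         - 1000000 * math.log(10))
p = math.exp(log_p)   # on the order of 10^-26
```

The computed value agrees with the Stirling-based estimate of about one chance in $4\times10↑{25}$.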
\ansno 3. 3040504030.
\ansno 4. Step K11 can be entered only from step K10 or step
K2, and in either case we find it impossible for $X$ to be zero
by a simple argument. If $X$ could be zero at that point, the
algorithm would not terminate.
\ansno 5. Since only $10↑{10}$ ten-digit numbers are possible,
some value of $X$ must be repeated during the first $10↑{10} + 1$
steps; and as soon as a value is repeated, the sequence continues
to repeat its past behavior.
\ansno 6. (a)\9Arguing as in the previous exercise, the sequence
must eventually repeat a value; let this repetition occur for
the first time at step $\mu + λ$, where $X↓{\mu+λ} = X↓\mu$.
(This condition defines $\mu$ and $λ$.) We have $0 ≤ \mu < m,\ 0
< λ ≤ m,\ \mu + λ ≤ m$. The values $\mu = 0, λ = m$ are attained
iff $f$ is a cyclic permutation; $\mu = m - 1, λ = 1$ occurs, e.g.,
if $X↓0 = 0,\ f(x) = x + 1$ for $x < m - 1$, and $f(m - 1) = m -
1$.

(b)\9We have, for $r ≥ n$, $X↓r = X↓n$ iff $r - n$ is
a multiple of $λ$ and $n ≥ \mu$. Hence $X↓{2n} = X↓n$ iff $n$ is
a multiple of $λ$ and $n ≥ \mu$. The desired results now follow
immediately. (Note: This is essentially a proof of the familiar
mathematical result that the powers of an element in a finite
semigroup include a unique idempotent element: take $X↓0 = a,\
f(x) = ax$.)
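The observation in part (b) — some $n$ with $X↓{2n}=X↓n$ exists, and any such $n$ is a multiple of $λ$ with $n≥\mu$ — is the basis of Floyd's cycle-finding method. The following sketch (an illustration, not part of the original answer) recovers both $\mu$ and $λ$ from it.

```python
def floyd(f, x0):
    """Return (mu, lam) for the eventually periodic sequence x0, f(x0), ..."""
    # Phase 1: find some n > 0 with X_{2n} = X_n; such n is a multiple of
    # lam and satisfies n >= mu, as shown in part (b).
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(f(hare))
    # Phase 2: since n is a multiple of lam, X_mu = X_{n+mu}; advancing both
    # pointers in step from X_0 and X_n, they first meet at X_mu.
    mu, tortoise = 0, x0
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(hare)
        mu += 1
    # Phase 3: walk once around the cycle to measure lam.
    lam, hare = 1, f(tortoise)
    while tortoise != hare:
        hare, lam = f(hare), lam + 1
    return mu, lam
```

For example, the function $f(x)=x+1$ for $x<m-1$, $f(m-1)=m-1$ mentioned in part (a) yields $\mu=m-1$, $λ=1$.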
\ansno 7. (a)\9The least $n>0$ such that $n-(\lscr(n)-1)$ is a multiple of $λ$
and $\lscr(n)-1≥\mu$ is $n=2↑{\lceil \lg \max(\mu+1,λ)\rceil}-1+λ$. [This
may be compared with the least $n>0$ such that $X↓{2n}=X↓n$, namely
$λ\delta↓{\mu0}+\mu+λ-1-\left((\mu+λ-1)\mod λ\right)$.]

(b)\9Start with $X=Y=X↓0,\ k=m=1$. $\biglp$At key places in this algorithm we will
have $X=X↓{2m-k-1}$, $Y=X↓{m-1}$, and $m=\lscr(2m-k)$.$\bigrp$ To generate the
next random number, do the following steps: Set $X←f(X)$ and $k←k-1$. If
$X=Y$, stop (the period length $λ$ is equal to $m-k$). Otherwise if $k=0$, set
$Y←X,\ m←2m,\ k←m$. Output $X$.
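In code, the procedure of part (b) might be sketched as follows ($k$ counts down to zero; the stated invariants make $λ=m-k$ correct at the moment $X=Y$):

```python
def brent_lambda(f, x0):
    """Period length lam of x0, f(x0), f(f(x0)), ... by the method of
    part (b): y holds X_{m-1} for m = 1, 2, 4, 8, ..., and each X_n with
    m <= n < 2m is compared against it; k counts down so that lam = m - k
    when a match occurs."""
    x = y = x0
    k = m = 1
    while True:
        x = f(x)
        k -= 1
        if x == y:
            return m - k
        if k == 0:
            y, m = x, 2 * m
            k = m
```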

{\sl Notes:} Brent has also
considered a more general method in which the successive values
of $Y=X↓{n↓i}$ satisfy $n↓1=0,\ n↓{i+1}=1+\lfloor pn↓i\rfloor$ where $p$ is
any number greater than 1. He showed that the best choice of $p$, approximately
2.4771, saves about 3 percent of the iterations by comparison with $p=2$. (See
exercise 4.5.4-4.)

The method in part (b) has a serious deficiency, however, since it might generate a
lot of nonrandom numbers before shutting off. For example, we might have a
particularly bad case such as $λ=1,\ \mu=2↑k$. A method based on Floyd's idea
in exercise 6(b), namely one which maintains $Y=X↓{2n}$ and $X=X↓n$ for
$n = 0,1,2,\ldotss$, will
stop before any number has been output twice. On the other hand if
$f$ is unknown (e.g., if we are receiving the
values $X↓0, X↓1, \ldots$ on-line from an outside source) or
if $f$ is difficult to apply, the following cycle detection algorithm
due to R. W. Gosper will be preferable: Maintain an auxiliary
table $T↓0, T↓1, \ldotss , T↓m$, where $m = \lfloor \lg n\rfloor$
when receiving $X↓n$. Initially $T↓0 ← X↓0$. For $n = 1, 2, \ldotss$,
compare $X↓n$ with each of $T↓0, \ldotss , T↓{\lfloor\lg n\rfloor}$; if
no match is found, set $T↓{e(n)} ← X↓n$, where $e(n) = \max\leftset e\relv
2↑e \hjust{ divides }n+1\rightset$. But if a match $X↓n =
T↓k$ is found, then $λ = n - \max\leftset m \relv m < n \hjust{ and } e(m) =
k\rightset$.

This procedure does not stop before generating a number twice, but at most $\lceil
{2\over3}λ\rceil$ of the $X$'s will have occurred more than once.
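The table-driven method can be sketched as follows. (This is an illustrative variant, not the original formulation: each table slot also records the index at which it was written, which makes the $λ$ formula immediate, since $T↓k$ always holds $X↓m$ for the largest $m$ so far with $e(m)=k$.)

```python
def gosper_lambda(stream):
    """Cycle length of an eventually periodic stream, inspecting values
    only.  table[k] holds (X_m, m) for the largest m seen with e(m) = k,
    where e(m) = max{e : 2^e divides m+1}; a match X_n = table[k][0]
    therefore gives lam = n - table[k][1]."""
    table = []
    for n, x in enumerate(stream):
        for value, index in table:
            if x == value:
                return n - index
        e = ((n + 1) & -(n + 1)).bit_length() - 1   # max e with 2^e | n+1
        if e == len(table):
            table.append((x, n))
        else:
            table[e] = (x, n)

def iterates(f, x0):
    """The stream x0, f(x0), f(f(x0)), ..."""
    x = x0
    while True:
        yield x
        x = f(x)
```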
\ansno 8. (a,b) $00, 00,\ldots$ [62 starting values]; $10, 10,\ldots$
[19]; $60, 60,\ldots$ [15]; $50, 50,\ldots$ [1]; $24, 57, 24,
57,\ldots$ [3]. (c) 42 or 69; these both lead to a set of fifteen
distinct values, namely (42 or 69), 76, 77, 92, 46, 11, 12,
14, 19, 36, 29, 84, 05, 02, 00.
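These cycles are easy to reproduce, taking the two-digit middle square to be $\lfloor X↑2/10\rfloor\mod 100$, the middle two digits of the four-digit square (a verification sketch, not part of the original answer):

```python
def mid2(x):
    """Middle two digits of the four-digit square of a two-digit number."""
    return (x * x // 10) % 100

def trajectory(x):
    """Values visited from x until the first repetition."""
    seen = []
    while x not in seen:
        seen.append(x)
        x = mid2(x)
    return seen
```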
\ansno 9. Since $X < b↑n$, we have $X↑2 < b↑{2n}$, and the middle square
is $\lfloor X↑2/b↑n\rfloor ≤ X↑2/b↑n$. If $X > 0$, then $X↑2/b↑n <
Xb↑n/b↑n = X$.
\ansno 10. If $X = ab↑n$, the next number of the sequence has
the same form; it is $(a↑2\mod b↑n)b↑n$. If $a$ is a multiple
of $b$ or of all the prime factors of $b$, the sequence will soon
degenerate to zero; if not, the sequence will degenerate into
a cycle of numbers having the same general form as $X$.

Further facts about the middle-square method have
been found by B. Jansson, {\sl Random Number Generators} (Stockholm:
Almqvist and Wiksell, 1966), Section 3A. Numerologists will
be interested to learn that the number 3792 is self-reproducing
in the four-digit middle-square method, since $3792↑2 = 14379264$;
furthermore (as Jansson has observed), it is ``self-reproducing''
in another sense too, since its prime factorization is $3\cdot 79 \cdot
2↑4$!
\ansno 11. The probability that $λ = 1$ and $\mu = 0$ is the probability
that $X↓1 = X↓0$, namely $1/m$. The probability that $λ = 1, \mu
= 1$, or $λ = 2, \mu = 0$, is the probability that $X↓1 ≠ X↓0$
and that $X↓2$ has a certain value, so it is $(1 - 1/m)(1/m)$.
Similarly, the probability that the sequence has any given $\mu$
and $λ$ is a function only of $\mu + λ$, namely
$$P(\mu , λ) = {1\over m}\prod ↓{1≤k<\mu +λ}\left(1-{k\over m}\right).$$
For the probability that $λ = 1$, we have
$$\sum ↓{\mu ≥0}{1\over m}\prod↓{1≤k≤\mu}\left(1-{k\over m}\right)
={1\over m}Q(m),$$
where $Q(m)$ is defined in Section 1.2.11.3, Eq.
(2). By Eq. (25) in that section, the probability is approximately
$\sqrt{π/2m} \approx 1.25/\sqrt{m}$. The chance of Algorithm K converging
as it did is only about one in 80000; the author was decidedly
unlucky. But see exercise 15 for further comments on the ``colossalness.''
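The sum $Q(m)/m$ is easy to evaluate numerically; the sketch below (not part of the original answer) also confirms the $\approx1.25/\sqrt m$ estimate.

```python
def prob_lambda_is_1(m):
    """P(lambda = 1) = (1/m) sum over mu >= 0 of prod_{1<=k<=mu}(1 - k/m),
    i.e. Q(m)/m."""
    total, prod = 0.0, 1.0
    for mu in range(m):          # terms vanish once mu reaches m
        total += prod / m
        prod *= 1.0 - (mu + 1) / m
    return total
```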
%\ansno 12. This answer begins with an aligned equation.
\penalty-200\vskip 12pt plus 6pt minus 6pt
\halign{\hjust to 19pt{\hfill#}⊗\quad\hfill$\dispstyle#$⊗$\dispstyle\;#$\hfill\cr
{\bf 12.}⊗\sum↓{\scriptstyle 1≤λ≤m \atop \scriptstyle 0≤\mu<m}λP(\mu,λ)⊗
={1\over m}\left(1+3\left(1-{1\over m}\right)+6\left(1-{1\over m}\right)
\left(1-{2\over m}\right)+\cdots\right)\cr
⊗⊗={1+Q(m) \over 2}.\cr}
\vskip 9 pt plus 6pt minus 6pt
\noindent
(See the previous answer. In general if $f(a↓0, a↓1,\ldotss) = \sum↓{n≥0}a↓n
\prod ↓{1≤k≤n}(1 - k/m)$ then $f(a↓0,
a↓1,\ldotss) = a↓0 + f(a↓1, a↓2,\ldotss) - f(a↓1, 2a↓2,\ldotss)/m$;
apply this identity with $a↓n = (n + 1)/2$.) Therefore the average
value of $λ$ (and, by symmetry of $P(\mu , λ)$, also of $\mu + 1$)
is approximately $\sqrt{πm/8}+ {1\over 3}$. The average value
of $\mu + λ$ is exactly $Q(m)$, approximately $\sqrt{πm/2} - {1\over 3}$.
[For alternative derivations and further results, including asymptotic
values for the moments, see A. Rapoport, {\sl Bull.\ Math.\ Biophysics
\bf 10} (1948), 145--157, and B. Harris, {\sl Annals Math.\ Stat.\
\bf 31} (1960), 1045--1062; see also I. M. Sobol, {\sl Theory of
Probability and its Applications \bf 9} (1964), 333--338. Sobol
discusses the asymptotic period length for the more general
sequence $X↓{n+1} = f(X↓n)$ if $n \neqv 0 \modulo{m};\ X↓{n+1}
= g(X↓n)$ if $n ≡ 0 \modulo{m}$; with both $f$ and $g$ random.]
\ansno 13. [Paul Purdom and John Williams, {\sl Trans.\ Amer.\ Math.\
Soc.\ \bf 133} (1968), 547--551.] Let $T↓{mn}$ be the number
of functions which have $n$ one-cycles and no cycles of length
greater than one. Then$$
T↓{mn}={m-1\choose n-1}m↑{m-n}.$$
(See Section 2.3.4.4.) {\sl Any} function is such a function followed by a
permutation of the $n$ elements that were the one-cycles. Hence $\sum↓{n≥1}
T↓{mn}\,n! = m↑m$.
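The identity $\sum↓{n≥1}T↓{mn}\,n!=m↑m$ is easily checked for small $m$ (a verification sketch, not part of the answer):

```python
from math import comb

def T(m, n):
    """Number of functions on an m-set having exactly n one-cycles and no
    cycles of length greater than one: C(m-1, n-1) * m^(m-n)."""
    return comb(m - 1, n - 1) * m ** (m - n)
```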

Let $P↓{nk}$ be the number of permutations of $n$ elements in which the
longest cycle is of length $k$. Then the number of functions with a maximum
cycle of length $k$ is $\sum↓{n≥1}T↓{mn}P↓{nk}$. To get the average value of
$k$, we compute $\sum↓{k≥1}\sum↓{n≥1}kT↓{mn}P↓{nk}$, which by the result of
exercise 1.3.3-23 is $\sum↓{n≥1}T↓{mn}\,n!(n+{1\over 2}+ε↓n)c$
where $c \approx .62433$ and
$ε↓n→0$ as $n→∞$. Summing, we get the average value $cQ(m)+{1\over 2}c+\delta↓m$,
where
$\delta↓m→0$ as $m→∞$. (This is not substantially larger than the average
value when $X↓0$ is selected at random. The average value of $\max(\mu+λ)$ is
still unknown.)
%folio 654 galley 2 (C) Addison-Wesley 1978 *
\ansno 14. Let $c↓r(m)$ be the number of
functions with exactly $r$ different final cycles. From the
recurrence $c↓1(m) = (m - 1)! - \sum ↓{k>0}{m\choose k}(-1)↑k(m
- k)↑kc↓1(m - k)$, which comes by counting the number of functions
whose image contains at most $m - k$ elements, we find the solution
$c↓1(m) = m↑{m-1}Q(m)$. (Cf.\ exercise 1.2.11.3--16.) Another
way to obtain the value of $c↓1(m)$, which is perhaps more elegant
and revealing, is given in exercise 2.3.4.4--17. The value of
$c↓r(m)$ may be determined by solving the recurrence
$$c↓r(m) = \sum ↓{0<k<m}{m-1\choose k-1}c↓1(k)c↓{r-1}(m-k),$$
which has the solution
$$c↓r(m) = m↑{m-1}\left({1\over 0!}{1\comb[]r}
+ {1\over 1!}{2\comb[]r}{m - 1\over m} + {1\over
2!}{3\comb[]r} {m - 1\over m} {m - 2\over m} + \cdots\right).$$
The desired average value can now be computed;
it is (cf.\ exercise 12)
$$\eqalign{E↓m⊗= {1\over m} \left(H↓1 + 2H↓2 {m
- 1\over m} + 3H↓3 {m - 1\over m} {m - 2\over m} + \cdots\right)\cr
⊗= 1 + {1\over 2} {m - 1\over m} + {1\over 3} {m - 1\over m}
{m - 2\over m} + \cdotss.\cr}$$
This latter formula was obtained by quite different
means by Martin D. Kruskal, {\sl AMM \bf 61} (1954), 392--397.
Using the integral representation
$$E↓m = \int ↑{∞}↓{0}\left(\left(
1 + {x\over m}\right)↑m - 1\right) e↑{-x}\, {dx\over x},$$
he proved the asymptotic relation
$\lim↓{m→∞} (E↓m - {1\over 2}\ln m) = {1\over 2}(\gamma
+\ln 2)$.
For further results and references, see John Riordan,
{\sl Annals Math.\ Stat.\ \bf 33} (1962), 178--185.
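For small $m$ the average number of final cycles can be confirmed by exhausting all $m↑m$ functions (an independent check, not part of the original answer; `E` below is the exact series displayed above):

```python
from fractions import Fraction

def num_final_cycles(f):
    """Number of distinct cycles among the cyclic elements of i -> f[i]."""
    m = len(f)
    g = list(range(m))
    for _ in range(m):              # after m steps every path is on its cycle
        g = [f[x] for x in g]
    cyclic = set(g)                 # exactly the cyclic elements
    count = 0
    while cyclic:
        x = cyclic.pop()            # remove one whole cycle at a time
        y = f[x]
        while y != x:
            cyclic.discard(y)
            y = f[y]
        count += 1
    return count

def E(m):
    """1 + (1/2)(m-1)/m + (1/3)(m-1)(m-2)/m^2 + ..., exactly."""
    total, prod = Fraction(0), Fraction(1)
    for k in range(1, m + 1):
        total += prod / k
        prod *= Fraction(m - k, m)
    return total
```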
\ansno 15. The probability that $f(x) ≠ x$ for all $x$
is $(m - 1)↑m/m↑m$, which is approximately $1/e$. The existence
of a self-repeating value in an algorithm like Algorithm K is
therefore not ``colossal'' at all---it occurs with probability
$1 - 1/e \approx .63212$. The only ``colossal'' thing was that
the author happened to hit such a value when $X↓0$ was chosen
at random (see exercise 11).
\ansno 16. The sequence will repeat when a pair of successive
elements occurs for the second time. The maximum period is $m↑2$.
(Cf.\ next exercise.)
\ansno 17. After selecting $X↓0, \ldotss , X↓{k-1}$ arbitrarily,
let $X↓{n+1} = f(X↓n, \ldotss , X↓{n-k+1})$, where $0 ≤ x↓1,
\ldotss , x↓k < m$ implies that $0 ≤f(x↓1, \ldotss , x↓k) < m$.
The maximum period is $m↑k$. This is an obvious upper bound,
but it is not obvious that it can be attained; for a proof that
it can always be attained for suitable $f$, see exercises 3.2.2--17
and 3.2.2--21, and for the number of ways to attain it see exercise
2.3.4.2--23.
\ansno 18. Same as exercise 7, but use the $k$-tuple of elements
$(X↓n,\ldotss,X↓{n-k+1})$
in place of $X↓n$.
\ansbegin{3.2.1}
\ansno 1. Take $X↓0$ even, $a$
even, $c$ odd. Then $X↓n$ is odd for $n > 0$.
\ansno 2. Let $X↓r$ be the first repeated value in the
sequence. If $X↓r = X↓k$ for $0 < k < r$, we could prove that
$X↓{r-1} = X↓{k-1}$, since $X↓n$ uniquely determines $X↓{n-1}$
when $a$ is prime to $m$. Hence, $k = 0$.
\ansno 3. If $d$ is the greatest common
divisor of $a$ and $m$, the quantity $aX↓n$ can take on at most
$m/d$ values. The situation can be even worse; e.g., if $m =
2↑e$ and if $a$ is even, Eq. (6) shows that the sequence is
eventually constant.
\ansno 4. Induction on $k$.
\ansno 5. If $a$ is relatively prime to $m$, there is
a number $a↑\prime$ for which $aa↑\prime ≡ 1 \modulo{m}$. Then
$X↓{n-1} = (a↑\prime X↓n - a↑\prime c)\mod m$, and in general,
$$\eqalign{X↓{n-k}⊗=\left( (a↑\prime )↑kX↓n - c(a↑\prime + \cdots
+ (a↑\prime )↑k)\right)\mod m\cr ⊗= \left((a↑\prime
)↑kX↓n - c {(a↑\prime )↑{k+1} - a↑\prime \over a↑\prime - 1}\right)\mod
m\cr}$$
when $k > 0,\ n - k ≥ 0$. If $a$ is not relatively prime
to $m$, it is not possible to determine $X↓{n-1}$ when $X↓n$
is given; multiples of $m/\gcd(a, m)$ may be added to
$X↓{n-1}$ without changing the value of $X↓n$. (See also exercise
3.2.1.3--7.)
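As a concrete illustration of running the sequence backward (the parameters here are hypothetical, chosen only so that $\gcd(a,m)=1$):

```python
m = 10 ** 10
a, c = 10 ** 5 + 1, 12345      # hypothetical, with gcd(a, m) = 1
a_inv = pow(a, -1, m)          # a' with a a' ≡ 1 (mod m)

def forward(x):
    """X_n -> X_{n+1} = (a X_n + c) mod m."""
    return (a * x + c) % m

def backward(x):
    """X_n -> X_{n-1} = (a' X_n - a' c) mod m."""
    return a_inv * (x - c) % m
```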
\ansbegin{3.2.1.1}
\ansno 1. Let $c↑\prime$ be a
solution to the congruence $ac↑\prime ≡ c \modulo{m}$. (Thus,
$c↑\prime = a↑\prime c\mod m$, if $a↑\prime$ is the number in
the answer to exercise 3.2.1--5.) Then we have
$$\vcenter{\mixtwo{LDA⊗X\cr
ADD⊗CPRIME\cr
MUL⊗A⊗\blackslug\cr}}$$
Overflow is possible on this addition operation. (From results derived later
in the chapter, it is probably best to take $c=a$ and to replace the {\:t ADD}
instruction by ``{\:t INCA 1}''. Then if $X↓0=0$, overflow will not occur
until the end of the period, so it won't occur in practice.)
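The transformation being used here replaces $(aX+c)\mod w$ by $a\cdot\biglp(X+c↑\prime)\mod w\bigrp\mod w$; a quick check in ordinary arithmetic (the word size and constants are hypothetical):

```python
w = 2 ** 31                        # hypothetical word size
a, c = 2 ** 16 + 3, 7              # hypothetical multiplier and increment (a odd)
c_prime = pow(a, -1, w) * c % w    # solves a c' ≡ c (mod w)

def step_direct(x):
    return (a * x + c) % w

def step_via_cprime(x):
    return a * ((x + c_prime) % w) % w   # same value, since a c' ≡ c (mod w)
```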
\mixans 2. {⊗RANDM⊗STJ⊗1F\cr
⊗⊗LDA⊗XRAND\cr
⊗⊗MUL⊗A\cr
⊗⊗SLAX⊗5\cr
⊗⊗ADD⊗C⊗(or, {\tt INCA} $c$, if $c$ is small)\cr
⊗⊗STA⊗XRAND\cr
⊗1H⊗JNOV⊗*\cr
⊗⊗JMP⊗*-1\cr
⊗XRAND⊗CON⊗X\cr
⊗A⊗CON⊗$a$\cr
⊗C⊗CON⊗$c$⊗\blackslug\cr}
\yyskip
\noindent{\sl Note:} Locations {\tt A} and {\tt C} should probably be named
{\tt 2H} and {\tt 3H} to avoid conflict with other symbols, if this subroutine
is to be used by other programmers.
\ansno 3. See {\tt WM1} at the end of Program 4.2.3D.
\ansno 4. \def\\{\<\,\mathbin{\underline{\char'155\char'157\char'144}}\<\,}
Define the operation $x\\2↑e = y$ iff
$x ≡ y\modulo{2↑e}$ and $-2↑{e-1} ≤ y < 2↑{e-1}$. The congruential
sequence $\langle Y↓n\rangle$ defined by
$$Y↓0 = X↓0\\2↑{32},\qquad Y↓{n+1} = (aY↓n + c)\\2↑{32}$$
is easy to compute on 370-style machines, since
the lower half of the product of $y$ and $z$ is $(yz)\\2↑{32}$
for all two's complement numbers $y$ and $z$, and since addition
ignoring overflow also delivers its result $\!\null\\ 2↑{32}$. This
sequence has all the randomness properties of the standard linear
congruential sequence $\langle X↓n\rangle $, since $Y↓n ≡ X↓n\modulo
{2↑{32}}$. Indeed, the two's complement representation of $Y↓n$
is {\sl identical} to the binary representation of $X↓n$, for
all $n$. [G. Marsaglia and T. A. Bray first pointed this
out in {\sl CACM \bf 11} (1968), 757--759.]
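The point of this answer can be demonstrated directly (a sketch with hypothetical constants): run one copy of the sequence modulo $2↑{32}$ and one with 32-bit two's complement wraparound, and observe identical bit patterns.

```python
M = 2 ** 32

def to_signed(v):
    """Representative of v mod 2^32 in [-2^31, 2^31), as a two's
    complement machine would deliver it."""
    v %= M
    return v - M if v >= 2 ** 31 else v

a, c = 69069, 1                    # hypothetical multiplier and increment

def step_unsigned(x):
    return (a * x + c) % M         # ordinary congruential step

def step_signed(y):
    return to_signed(a * y + c)    # wraparound multiply-and-add
```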
\ansno 5. (a) Subtraction: {\tt LDA X, SUB Y, JANN *+2,
ADD M}.\xskip (b) Addition: {\tt LDA X, SUB M, ADD Y, JANN
*+2, ADD M}. (Note that if $m$ is near the word size, the instruction
``{\tt SUB M}'' must precede the instruction ``{\tt ADD Y}''.)
\ansno 6. The sequences are not essentially different,
since adding the constant $(m - c)$ has the same effect as subtracting
the constant $c$. The operation must be combined with
multiplication, so a subtractive process has little merit over
the additive one (at least in \MIX's case), except when it is
necessary to avoid affecting the overflow toggle.
\ansno 7. The prime factors of $z↑k - 1$ appear in the
factorization of $z↑{kr} - 1$. If $r$ is odd, the prime factors
of $z↑k + 1$ appear in the factorization of $z↑{kr} + 1$. And
$z↑{2k} - 1$ equals
$(z↑k - 1)(z↑k + 1)$.
%\ansno 8. appears within the following alignment.
\anskip
\halign{\hjust to 19pt{#}⊗\lft{\tt#}\quad⊗\lft{\tt#}\quad⊗
\lft{\rm#}\cr
{\hfill\bf 8. }⊗JOV⊗*+1
⊗(Ensure that overflow is off.)\cr
⊗LDA⊗X\cr
⊗MUL⊗A\cr
⊗STX⊗TEMP\cr
⊗ADD⊗TEMP⊗Add lower half to upper half.\cr
⊗JNOV⊗*+2⊗If $≥w$, subtract $w - 1$.\cr
⊗INCA⊗1⊗(Overflow is impossible in this step.)\quad\blackslug\cr}
\yyskip
\noindent{\sl Note:} Since addition on an $e$-bit
ones'-complement computer is $\mod\;(2↑e - 1)$, it is possible
to combine the techniques of exercises 4 and 8, producing $(yz)\mod(2↑e
- 1)$ by adding together the two $e$-bit halves of the product
$yz$, for all ones' complement numbers $y$ and $z$ regardless
of sign.
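Combined with an ordinary full product, the fold-the-halves trick gives multiplication mod $2↑e-1$; a sketch for nonnegative operands (the word size is hypothetical):

```python
E = 31
W = 1 << E                     # hypothetical word size w = 2^e

def mul_mod_mersenne(y, z):
    """(y z) mod (2^e - 1) for 0 <= y, z < w, by adding the two e-bit
    halves of the product; a second fold absorbs a possible carry, and
    w - 1 itself folds to 0."""
    p = y * z
    s = (p >> E) + (p & (W - 1))
    if s >= W:
        s = (s >> E) + (s & (W - 1))
    return 0 if s == W - 1 else s
```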
\ansno 9. The pairs $(0, w - 2)$, $(1, w - 1)$ are treated
as equivalent in input and output:
\yyskip
\halign{\hjust to 19pt{#}⊗\lft{\tt#}\quad⊗\lft{\tt#}\quad⊗
\lft{\rm#}\cr
⊗JOV⊗*+1\cr
⊗LDA⊗X\cr
⊗MUL⊗A⊗$aX = qw + r$\cr
⊗SLC⊗5⊗$\rA ← r,\ \rX ← q$.\cr
⊗STX⊗TEMP\cr
⊗ADD⊗TEMP\cr
⊗JNOV⊗*+2⊗Get $(r + q)\mod(w - 2)$.\cr
⊗INCA⊗2⊗Overflow is impossible.\cr
⊗ADD⊗TEMP\cr
⊗JNOV⊗*+3⊗Get $(r + 2q)\mod(w - 2)$.\cr
⊗INCA⊗2⊗Overflow is possible.\cr
⊗JOV⊗*-1⊗\hskip4.5cm\blackslug\cr}
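The same computation in ordinary arithmetic, mirroring the steps of the program (the word size is hypothetical); as stated above, results are reduced only into $[0,w)$, with $0$ and $w-2$, $1$ and $w-1$ treated as equivalent.

```python
E = 31
W = 1 << E                      # hypothetical word size w

def mul_mod_w_minus_2(a, x):
    """a x = q w + r, then r + 2q ≡ a x (mod w - 2), folding w into 2
    whenever a partial sum reaches w (as the JNOV/INCA steps do)."""
    q, r = divmod(a * x, W)
    s = r + q
    if s >= W:                  # one fold suffices here, since q <= w - 2
        s += 2 - W
    s += q
    while s >= W:               # mirrors the JOV *-1 loop
        s += 2 - W
    return s
```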
%folio 659 galley 3 (C) Addison-Wesley 1978 *
\ansbegin{3.2.1.2}
\ansno 1. Period length $m$,
by Theorem A. (Cf.\ exercise 3.)
\ansno 2. Yes, these conditions imply the conditions
in Theorem A, since the only prime divisor of $2↑e$ is 2, and
any odd number is relatively prime to $2↑e$. (In fact, the conditions
of the exercise are {\sl necessary} and sufficient.)
\ansno 3. By Theorem A, we need $a ≡ 1 \modulo{4}$ and
$a ≡ 1\modulo{5}$. By Law D of Section 1.2.4, this is equivalent
to $a ≡ 1\modulo{20}$.
\ansno 4. We know $X↓{2↑{e-1}} ≡ 0
\modulo{2↑{e-1}}$ by using Theorem A in the case $m = 2↑{e-1}$.
Also using Theorem A for $m = 2↑e$, we know that $X↓{2↑{e-1}}
\neqv 0\modulo{2↑e}$. It follows
that $X↓{2↑{e-1}} = 2↑{e-1}$. More generally,
we can use Eq. 3.2.1--6 to prove that the second half of the
period is essentially like the first half, since $X↓{n+2↑{e-1}}
= (X↓n + 2↑{e-1})\mod 2↑e$. (The quarters
are similar too; see exercise 21.)
\ansno 5. We need $a ≡ 1 \modulo{p}$ for $p = 3, 11,
43, 281, 86171$. By Law D of Section 1.2.4, this is equivalent
to $a ≡ 1 \modulo{3 \cdot 11 \cdot 43 \cdot 281 \cdot 86171}$, so the {\sl
only} solution is the terrible multiplier $a = 1$.
\ansno 6. (Cf.\ previous exercise.) $a ≡ 1 \modulo {3\cdot
7 \cdot 11 \cdot 13 \cdot 37}$ implies that the solutions are $a =
1 + 111111k$, for $0 ≤ k ≤ 8$.
\ansno 7. Using the notation of the proof of Lemma Q,
$\mu$ is the smallest value such that $X↓{\mu+λ} = X↓\mu$;
so it is the smallest value such that $Y↓{\mu+λ} = Y↓\mu$
and $Z↓{\mu+λ} = Z↓\mu$. This shows that $\mu = \max(\mu
↓1, \ldotss , \mu ↓t)$. The highest achievable $\mu$ is $\max(e↓1,
\ldotss , e↓t)$, but nobody really wants to achieve it.
\ansno 8. $a↑2 ≡ 1 \modulo{8}$; so $a↑4 ≡ 1 \modulo
{16}$, $a↑8 ≡ 1 \modulo{32}$, etc. If $a \mod 4 = 3$, then $a - 1$ is
twice an odd number; thus, we have $(a↑{2↑{e-1}} - 1)/2 ≡ 0 \modulo
{2↑{e+1}/2}$, and this yields the desired result.
\ansno 9. Substitute for $X↓n$ in terms of $Y↓n$ and
simplify. If $X↓0$ mod 4 = 3, the formulas of the exercise do
not apply, but they do apply to the sequence $Z↓n = (-X↓n)\mod
2↑e$ which has essentially the same behavior.
\ansno 10. Only $m = 1$, 2, 4, $p↑e$, and $2p↑e$, for odd
primes $p$. In all other cases, the result of Theorem B is an
improvement over Euler's theorem (exercise 1.2.4--28).
\ansno 11. (a) Either $x + 1$ or $x - 1$ (not both) will
be a multiple of 4, so $x \mp 1 = q2↑f$, where $q$ is odd and
$f$ is greater than 1.\xskip (b) In the given circumstances, $f <
e$ and so $e ≥ 3$. We have $\pm x ≡ 1 \modulo{2↑f}$ and $\pm
x \neqv 1 \modulo{2↑{f+1}}$ and $f > 1$. Hence by applying
Lemma P, we find that $(\pm x)↑{2↑{e-f-1}} \neqv 1 \modulo
{2↑e}$, while $x↑{2↑{e-f}} = (\pm x)↑{2↑{e-f}} ≡ 1 \modulo {2↑e}$.
So the order is a divisor of $2↑{e-f}$, but not a divisor of
$2↑{e-f-1}$.\xskip (c) 1 has order 1; $2↑e - 1$ has order 2; the maximum
period when $e ≥ 3$ is therefore $2↑{e-2}$, and for $e≥4$ it is necessary to
have $f = 2$, that is, $x ≡ 4 \pm 1 \modulo {8}$.
\ansno 12. If $k$ is a proper divisor of $p - 1$ and
if $a↑k ≡ 1 \modulo{p}$, then by Lemma P $a↑{kp↑{e-1}}
≡ 1 \modulo {p↑e}$. Similarly, if $a↑{p-1}
≡ 1 \modulo {p↑2}$, we find that $a↑{(p-1)p↑{e-2}}
≡ 1 \modulo {p↑e}$. So in these cases $a$ is {\sl
not} primitive. Conversely, if $a↑{p-1} \neqv 1 \modulo
{p↑2}$, Theorem 1.2.4F and Lemma P tell us that $a↑{(p-1)p↑{e-2}}
\neqv 1 \modulo {p↑e}$, but $a↑{(p-1)p↑{e-1}}
≡ 1 \modulo {p↑e}$. So the order is a divisor
of $(p - 1)p↑{e-1}$ but not of $(p - 1)p↑{e-2}$; it therefore
has the form $kp↑{e-1}$, where $k$ divides $p - 1$. But if $a$
is primitive modulo $p$, the congruence $a↑{kp↑{e-1}}
≡ a↑k ≡ 1 \modulo {p}$ implies that $k = p - 1$.
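The criterion just derived — $a$ primitive modulo $p$ together with $a↑{p-1}\neqv1\modulo{p↑2}$ — is straightforward to test mechanically (a sketch, not part of the answer):

```python
def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    fs, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            fs.add(d)
            n //= d
        d += 1
    if n > 1:
        fs.add(n)
    return fs

def is_primitive_mod_p_e(a, p):
    """True iff a is primitive modulo p^e for every e >= 1 (p an odd
    prime): a must have order p-1 modulo p, and a^(p-1) must not be
    congruent to 1 modulo p^2."""
    if a % p == 0:
        return False
    if any(pow(a, (p - 1) // q, p) == 1 for q in prime_factors(p - 1)):
        return False               # order modulo p is a proper divisor of p-1
    return pow(a, p - 1, p * p) != 1
```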
\ansno 13. Let $λ$ be the order of $a$ modulo $p$. By
Theorem 1.2.4F, $λ$ is a divisor of $p - 1$. If $λ < p - 1$, then
$(p - 1)/λ$ has a prime factor, $q$.
\ansno 14. Let $0 < k < p$. If $a↑{p-1} ≡ 1 \modulo
{p↑2}$, then $(a + kp)↑{p-1} ≡ a↑{p-1} + (p - 1)a↑{p-2}kp \modulo
{p↑2}$; and this is $\neqv 1$, since $(p - 1)a↑{p-2}k$ is
not a multiple of $p$. By exercise 12, $a + kp$ is primitive
modulo $p↑e$.
\ansno 15. (a) If $λ↓1 = p↑{e↓1}↓{1} \ldotss p↑{e↓t}↓{t},
λ↓2 = p↑{f↓1}↓{1} \ldotss p↑{f↓t}↓{t}$, let $\kappa ↓1 = p↑{g↓1}↓{1}
\ldotss p↑{g↓t}↓{t}, \kappa ↓2 = p↑{h↓1}↓{1} \ldotss p↑{h↓t}↓{t}$,
where$$\vjust{
\halign{$#$⊗\lft{$\;#$}⊗\quad#\quad⊗$#$⊗\lft{$\;#$}⊗\qquad if\quad$#$\cr
g↓j⊗=e↓j⊗and⊗h↓j⊗=0,⊗e↓j<f↓j,\cr
g↓j⊗=0⊗and⊗h↓j⊗=f↓j,⊗e↓j≥f↓j.\cr}}$$
Now $a↑{\kappa ↓1}↓{1}, a↑{\kappa ↓2}↓{2}$ have periods
$λ↓1/\kappa ↓1, λ↓2/\kappa ↓2$, and the latter are relatively
prime. Furthermore $(λ↓1/\kappa ↓1)(λ↓2/\kappa ↓2) = λ$, so it suffices
to consider the case when $λ↓1$ is relatively prime to $λ↓2$,
that is, when $λ = λ↓1λ↓2$. Now since $(a↓1a↓2)↑λ≡1$, we
have $1 ≡ (a↓1a↓2)↑{λλ↓1} ≡ a↑{λλ↓1}↓{2}$; hence $λλ↓1$
is a multiple of $λ↓2$. This implies that $λ$ is a multiple
of $λ↓2$, since $λ↓1$ is relatively prime to $λ↓2$. Similarly,
$λ$ is a multiple of $λ↓1$; hence $λ$ is a multiple of $λ↓1λ↓2$.
But obviously $(a↓1a↓2)↑{λ↓1λ↓2} ≡ 1$, so $λ
= λ↓1λ↓2$.

(b) If $a↓1$ has order $λ(m)$ and if $a↓2$
has order $λ$, by part (a) $λ(m)$ must be a multiple of $λ$,
otherwise we could find an element of higher order, namely
of order lcm$\left(λ, λ(m)\right)$.
\ansno 16. (a) $f(x) = (x
- a)\biglp x↑{n-1} + (a + c↓1)x↑{n-2} + \cdots + (a↑{n-1}
+ \cdots + c↓{n-1})\bigrp + f(a)$.\xskip (b) The statement is clear
when $n = 0$. If $a$ is one root,
$f(x) ≡ (x - a)q(x)$;
therefore if $a↑\prime$ is any other root,
$$0 ≡ f(a↑\prime ) ≡ (a↑\prime - a)q(a↑\prime ),$$
and since $a↑\prime - a$ is not a multiple of $p$,
$a↑\prime$ must be a root of $q(x)$. So if $f(x)$ has more than
$n$ distinct roots, $q(x)$ has more than $n - 1$ distinct roots.
(c) $λ(p) ≥ p - 1$, since $f(x)$ must have degree $≥p - 1$ in
order to possess so many roots. But by Theorem 1.2.4F, $λ(p)
≤ p - 1$.
\ansno 17. By Lemma P, $11↑5 ≡ 1 \modulo{25}$, $11↑5 \neqv
1 \modulo{125}$, etc.; so the order of 11 is $5↑{e-1} \modulo
{5↑e}$, not the maximum value $λ(5↑e) = 4 \cdot 5↑{e-1}$. But by
Lemma Q the total period length is the least common multiple
of the period modulo $2↑e$ (namely $2↑{e-2}$) and the period
modulo $5↑e$ (namely $5↑{e-1}$), and this is $2↑{e-2}5↑{e-1}
= λ(10↑e)$. The period modulo $5↑e$ may be $5↑{e-1}$ or $2\cdot 5↑{e-1}$
or $4\cdot 5↑{e-1}$, without affecting the length of period modulo
$10↑e$, since the least common multiple is taken. The values
which are primitive modulo $5↑e$ are those congruent to 2, 3,
8, 12, 13, 17, 22, 23 modulo 25 (cf.\ exercise 12), namely 3,
13, 27, 37, 53, 67, 77, 83, 117, 123, 133, 147, 163, 173, 187,
197.
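These orders are quickly verified by brute force (a check, not part of the answer):

```python
def order(a, m):
    """Multiplicative order of a modulo m (gcd(a, m) = 1 assumed)."""
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k
```

In particular, the order of 11 modulo $5↑e$ is $5↑{e-1}$ as claimed, and the residues mod 25 of order $λ(25)=20$ are exactly the eight listed.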
\ansno 18. According to Theorem C, $a \mod 8$ must
be 3 or 5. Knowing the period of $a$ modulo 5 and modulo 25
allows us to apply Lemma P to determine admissible values of
$a$ mod 25. Period $= 4 \cdot 5↑{e-1}$: 2, 3, 8, 12, 13, 17, 22,
23; period $= 2 \cdot 5↑{e-1}$: 4, 9, 14, 19; period $= 5↑{e-1}$:
6, 11, 16, 21. Each of these 16 values yields one value of
$a$, $0 ≤ a < 200$, with $a \mod 8 = 3$, and another value of $a$
with $a \mod 8 = 5$.
\ansno 19. One example appears in Table 3.3.4--1, line 26.
\ansno 20. (a) We have $AY↓n + X↓0 ≡ AY↓{n+k} + X↓0 \modulo
{m}$ iff $Y↓n ≡ Y↓{n+k} \modulo {m↑\prime}$.\xskip (b)(i) Obvious.\xskip
(ii) Theorem A.\xskip (iii) $(a↑n - 1)/(a - 1) ≡ 0 \modulo {2↑e}$
iff $a↑n ≡ 1 \modulo {2↑{e+1}}$; and if $a \neqv -1$,
the order of $a$ modulo $2↑{e+1}$ is twice its order modulo
$2↑e$.\xskip (iv) $(a↑n - 1)/(a - 1) ≡ 0 \modulo {p↑e}$ iff $a↑n
≡ 1$.
\ansno 21. $X↓{n+s} ≡ X↓n+X↓s$ by Eq. 3.2.1--6; and $s$ is a divisor of
$m$, since $s$ is a power of $p$ when $m$ is a power of $p$. Hence
a given integer $q$ is a multiple of $m/s$ iff $X↓{qs} ≡ 0$ iff $q$ is
a multiple of $m/\gcd(X↓s,m)$.
\ansbegin{3.2.1.3}
\ansno 1. $c = 1$ is always relatively
prime to $B↑5$; and every prime dividing $m = B↑5$ is a divisor
of $B$, so it divides $b = B↑2$ to at least the second power.
\ansno 2. 3, so the generator is not recommended in spite
of its long period.
\ansno 3. The potency is 18 in both cases (see next exercise).
\ansno 4. Since $a \mod 4 = 1$, we must have $a \mod 8
= 1$ or 5, so $b \mod 8 = 0$ or 4. If $b$ is an odd multiple of
4, and if $b↓1$ is a multiple of 8, clearly $b↑s ≡ 0 \modulo
{2↑e}$ implies that $b↑{s}↓{1} ≡ 0 \modulo {2↑e}$, so $b↓1$
cannot have higher potency.
\ansno 5. The potency is the smallest value of $s$ such
that $f↓js ≥ e↓j$ for all $j$.
\ansno 6. $m$ must be divisible by $2↑7$ or by $p↑4$ (for
odd prime $p$) in order to have a potency as high as 4. The
only values are $m = 2↑{27} + 1$ and $10↑9 - 1$.
\ansno 7. $a↑\prime = (1 - b + b↑2 - \cdots)\mod m$,
where the terms in $b↑s, b↑{s+1}$, etc., are dropped (if $s$
is the potency).
\ansno 8. Since $X↓n$ is always odd,
$$X↓{n+2} = (2↑{34} + 3 \cdot 2↑{18} + 9)X↓n \mod 2↑{35} = (2↑{34}
+ 6X↓{n+1} - 9X↓n)\mod 2↑{35}.$$
Given $Y↓n$ and $Y↓{n+1}$, the possibilities for
$$Y↓{n+2} \approx \left( 5 + 6(Y↓{n+1} + \epsilon ↓1) - 9(Y↓n
+ \epsilon ↓2)\right)\mod 10,$$
with $0 ≤ \epsilon ↓1 < 1,\ 0 ≤ \epsilon ↓2 < 1$,
are limited and nonrandom.
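The three-term congruence is easy to confirm numerically, assuming the generator of this exercise is $X↓{n+1}=(2↑{17}+3)X↓n\mod2↑{35}$ with an odd seed (so that $a↑2=2↑{34}+3\cdot2↑{18}+9$ and $2↑{34}X↓n≡2↑{34}$); a sketch:

```python
M = 2 ** 35
a = 2 ** 17 + 3        # so a^2 = 2^34 + 3*2^18 + 9

def sequence(x0, n):
    """x0 and the next n values of X_{k+1} = a X_k mod 2^35."""
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1] % M)
    return xs
```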

{\sl Note:} If the multiplier suggested in exercise
3 were, say, $2↑{33} + 2↑{18} + 2↑2 + 1$, instead of $2↑{23} +
2↑{14} + 2↑2 + 1$, we would similarly find $X↓{n+2} - 10X↓{n+1}
+ 25X↓n ≡\hjust{constant}\modulo{2↑{35}}$. In general, we do not
want $a \pm \delta$ to be divisible by high powers of 2 when
$\delta$ is small, else we get ``second order impotency.'' See
Section 3.3.4 for a more detailed discussion.

The generator which appears in this exercise is
discussed in an article by MacLaren and Marsaglia, {\sl JACM \bf 12}
(1965), 83--89. The deficiencies of such generators were first
demonstrated by M. Greenberger, {\sl CACM \bf 8} (1965), 177--179.
Yet generators like this were still in widespread use
more than ten years later (cf.\ the discussion of {\tt RANDU} in
Section 3.3.4).
%folio 662 galley 4 (C) Addison-Wesley 1978 *
\ansbegin{3.2.2}
\ansno 1. The method is useful
only with great caution. In the first place, $aU↓n$ is likely
to be so large that the addition of $c/m$ which follows will
lose almost all significance, and the ``mod 1'' operation will
nearly destroy any vestiges of significance which might remain.
We conclude that double-precision floating-point arithmetic is necessary.
Even with double precision, one must be sure that no rounding, etc.,
occurs to affect the numbers of the sequence in any way, since
that would destroy the theoretical grounds for the good behavior
of the sequence. (But see exercise 23.)
\ansno 2. $X↓{n+1}$ equals either $X↓{n-1} + X↓n$ or
$X↓{n-1} + X↓n - m$. If $X↓{n+1} < X↓n$ we must have $X↓{n+1}
= X↓{n-1} + X↓n - m;$ hence $X↓{n+1} < X↓{n-1}$.
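The observed property — the sum can decrease only by wrapping past $m$ — is easy to check mechanically (a sketch with a hypothetical modulus):

```python
m = 2 ** 16            # hypothetical modulus

def fib_triples(x0, x1, n):
    """First n consecutive triples (X_{k-1}, X_k, X_{k+1}) of the
    additive sequence X_{k+1} = (X_k + X_{k-1}) mod m."""
    triples = []
    a, b = x0, x1
    for _ in range(n):
        c = (a + b) % m
        triples.append((a, b, c))
        a, b = b, c
    return triples
```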
\ansno 3. (a) The underlined numbers are $V[j]$ after step M3.
\def\\#1{$\underline{#1}$}
\def\0{\hskip 4.5pt}
\def\3{$\ldots$}
$$\vjust{\baselineskip0pt\lineskip0pt\halign{\tabskip 4.5pt
\rt{#}⊗\rt{#}⊗#⊗\lft{#}\tabskip0pt\cr
\noalign{\hrule}
Output:⊗initial⊗\vrule height 10.25pt depth 4pt⊗
0\04\05\06\02\00\03\hjust to 4.5pt{\hfill(}2\07\04\0
1\06\03\00\05)\qquad and repeats.\cr
\noalign{\hrule}
$V[0]$:⊗0⊗\vrule height 10.25pt depth 3pt⊗
\\4\0\\7\07\07\07\07\07\07\0\\4\0\\7\07\07\07\07\07\07\0\\4\0\\7\0\3\cr
$V[1]$:⊗3⊗\vrule height 9.25pt depth 3pt⊗
3\03\03\03\03\03\0\\2\0\\5\05\05\05\05\05\05\0\\2\0\\5\05\05\0\3\cr
$V[2]$:⊗2⊗\vrule height 9.25pt depth 3pt⊗
2\02\02\02\0\\0\0\\3\03\03\03\03\03\03\0\\0\0\\3\03\03\03\03\0\3\cr
$V[3]$:⊗5⊗\vrule height 9.25pt depth 4pt⊗
5\05\0\\6\0\\1\01\01\01\01\01\01\0\\6\0\\1\01\01\01\01\01\01\0\3\cr
\noalign{\hrule}
$X$:⊗⊗\vrule height 10.25pt depth 3pt⊗
4\07\06\01\00\03\02\05\04\07\06\01\00\03\02\05\04\07\0\3\cr
$Y$:⊗⊗\vrule height 9.25pt depth 4pt⊗
0\01\06\07\04\05\02\03\00\01\06\07\04\05\02\03\00\01\0\3\cr
\noalign{\hrule}}}$$
So the potency has been reduced to 1! (See further comments in the
answer to exercise 15.)
(b) The underlined numbers are $V[j]$ after step B2.
$$\vjust{\baselineskip0pt\lineskip0pt\halign{
\tabskip 4.5pt\rt{#}⊗\rt{#}⊗#⊗\lft{#}\cr
\noalign{\hrule}
Output:⊗initial⊗\vrule height 10.25pt depth 4pt⊗
2\03\06\05\07\00\00\05\03\0\3\04\06\hjust to 4.5pt{\hfill(}3\0
0\0\3\04\07\hjust to 4.5pt{)\hfill}\3\cr
\noalign{\hrule}
$V[0]$:⊗0⊗\vrule height 10.25pt depth 3pt⊗
0\00\00\00\00\00\0\\5\0\\4\04\0\3\01\01\01\01\0\3\01\01\0\3\cr
$V[1]$:⊗3⊗\vrule height 9.25pt depth 3pt⊗
3\0\\6\0\\1\01\01\01\01\01\01\0\3\00\00\00\0\\4\0\3\00\00\0\3\cr
$V[2]$:⊗2⊗\vrule height 9.25pt depth 3pt⊗
\\7\07\07\07\0\\3\03\03\03\0\\7\0\3\06\0\\2\02\02\0\3\07\0\\2\0\3\cr
$V[3]$:⊗5⊗\vrule height 9.25pt depth 4pt⊗
5\05\05\0\\0\00\0\\2\02\02\02\0\3\0\\3\03\0\\5\05\0\3\0\\3\03\0\3\cr
\noalign{\hrule}
$X$:⊗4⊗\vrule height 10.25pt depth 4pt⊗
7\06\01\00\03\02\05\04\07\0\3\03\02\05\04\0\3\03\02\0\3\cr
\noalign{\hrule}}}$$
In this case the output is considerably better than the input; it enters a
repeating cycle of length 40 after 46 steps: 236570 05314 72632 40110 37564
76025 12541 73625 03746 (30175 24061 52317 46203 74531 60425 16753 02647).
The cycle can be found easily by applying the method of exercise 3.1-7 to the
above array until a column is repeated.
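The table for part (b) can also be reproduced by direct simulation. The sketch below assumes the source sequence is $X↓{n+1} = (5X↓n + 3) \mod 8$ with $X↓0 = 0$ (consistent with the initial contents $V = [0,3,2,5]$ and $X = 4$ shown above), and one common formulation of Algorithm B's loop: set $j ← \lfloor kY/m\rfloor$, output $Y ← V[j]$, then refill $V[j]$ from the source.

```python
# Sketch of Algorithm B on the toy generator of this exercise:
# k = 4, m = 8, X_{n+1} = (5*X_n + 3) mod 8, X_0 = 0 (assumed parameters).
def algorithm_b(n_outputs, k=4, m=8):
    state = [0]                      # X_0 = 0
    def next_x():
        x = state[0]
        state[0] = (5 * x + 3) % m
        return x
    V = [next_x() for _ in range(k)]  # V = [0, 3, 2, 5]
    Y = next_x()                      # Y = 4
    out = []
    for _ in range(n_outputs):
        j = (k * Y) // m              # choose a table position from Y
        Y = V[j]                      # output the stored value
        out.append(Y)
        V[j] = next_x()               # refill from the source sequence
    return out
```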
\ansno 4. The low-order byte of many random sequences
(e.g., linear congruential sequences with $m = \hjust{word size}$) is
much less random than the high-order byte. See Section 3.2.1.1.
\ansno 5. The randomizing effect would be quite minimized,
because $V[j]$ would always contain a number in a certain
range, essentially $j/k ≤ V[j]/m < (j + 1)/k$. However, some
similar approaches could be used: we could take $Y↓n = X↓{n-1}$,
or we could choose $j$ from $X↓n$ by extracting some digits
from about the middle instead of at the extreme left. None of
these suggestions would produce a lengthening of the period analogous
to the behavior of Algorithm B.
\ansno 6. For example, if $\hjust{\tt X}↓n/m < {1\over 2}$, then
$\hjust{\tt X}↓{n+1} = 2\hjust{\tt X}↓n$.
\ansno 7. [W. Mantel, {\sl Nieuw Archief voor Wiskunde}
(2) {\bf 1} (1897), 172--184.]$$
\def\\{\hskip 4.5pt}
\hjust{
\vjust{\halign{\rt{#}\cr
The subsequence of\cr
{\tt X} values:\cr}}
\quad
$\vcenter{\halign{\ctr{#}\cr
00\301\cr 00\310\cr .\\.\\.\cr 10\300\cr {\tt CONTENTS(A)}\cr}}$
\quad becomes:\quad
$\vcenter{\halign{\ctr{#}\cr
00\301\cr 00\310\cr .\\.\\.\cr 10\300\cr 00\300\cr {\tt CONTENTS(A)}\cr}}$
}$$
\ansno 8. We may assume that $X↓0 = 0,\ m = p↑e$,
as in the proof of Theorem 3.2.1.2A. First suppose that the
sequence has period length $p↑e$; it follows that the period
of the sequence mod $p↑f$ has length $p↑f$, for $1 ≤ f ≤ e$; otherwise some residues
mod $p↑f$ would never occur. Clearly, $c$ is not a multiple
of $p$, for otherwise each $X↓n$ would be a multiple of $p$.
If $p ≤ 3$, it is easy to establish the necessity of conditions
(iii) and (iv) by trial and error, so we may assume that $p
≥ 5$. If $d \neqv 0 \modulo{p}$ then $dx↑2 + ax + c ≡ d(x
+ a↓1)↑2 + c↓1 \modulo {p↑e}$ for some integers $a↓1$ and $c↓1$
and for all $x$; this quadratic takes the same value at the
points $x$ and $-x - 2a↓1$, so it cannot assume all values modulo
$p↑e$. Hence $d ≡ 0 \modulo{p}$; and if $a \neqv 1 \modulo{p}$,
we would have $dx↑2 + ax + c ≡ x \modulo {p}$ for some $x$,
contradicting the fact that the sequence mod $p$ has period
length $p$.
To show the sufficiency of the conditions, we
may assume by Theorem 3.2.1.2A and consideration of some trivial
cases that $m = p↑e$ where $e ≥ 2$. If $p = 2$, we have $X↓{n+p}
≡ X↓n + pc \modulo {p↑2}$, by trial; and if $p = 3$, we have
either $X↓{n+p} ≡ X↓n + pc \modulo {p↑2}$, for all $n$, or
$X↓{n+p} ≡ X↓n - pc \modulo {p↑2}$, for all $n$. For $p ≥ 5$,
we can prove that $X↓{n+p} ≡ X↓n + pc \modulo {p↑2}$: Let $d
= pr$, $a = 1 + ps$. Then if $X↓n ≡ cn + pY↓n \modulo {p↑2}$,
we must have $Y↓{n+1} ≡ n↑2c↑2r + ncs + Y↓n \modulo {p}$; hence
$Y↓n ≡ {n\choose 3}2c↑2r + {n\choose 2}(c↑{2}r + cs) \modulo
{p}$. Thus $Y↓p \mod p = 0$, and the desired relation has been
proved.
Now we can prove that the sequence $\langle X↓n\rangle$
of integers defined in the ``hint'' satisfies the relation
$$X↓{n+p↑f} ≡ X↓n + tp↑f\quad \modulo {p↑{f+1}},\qquad
n ≥ 0,$$
for some $t$ with $t \mod p ≠ 0$, and for all
$f ≥ 1$. This suffices to prove that the sequence $\langle X↓n
\mod p↑e\rangle$ has period length $p↑e$, for the length of
the period is a divisor of $p↑e$ but not a divisor of $p↑{e-1}$.
The above relation has already been established for $f = 1$,
and for $f > 1$ it can be proved by induction in the following
manner: Let
$$X↓{n+p↑f} ≡ X↓n + tp↑f + Z↓np↑{f+1}\quad \modulo
{p↑{f+2}};$$
then the quadratic law for generating the sequence,
with $d = pr$, $a = 1 + ps$, yields
$Z↓{n+1} ≡ 2rtnc + st + Z↓n\modulo {p}$.
It follows that $Z↓{n+p} ≡ Z↓n \modulo {p}$; hence
$$X↓{n+kp↑f} ≡ X↓n + k(tp↑f + Z↓np↑{f+1})\quad \modulo
{p↑{f+2}}$$
for $k = 1, 2, 3, \ldotss$; setting $k = p$ completes
the proof.
{\sl Notes:} If $f(x)$ is a polynomial of degree
higher than 2 and $X↓{n+1} = f(X↓n)$, the analysis is more complicated,
although we can use the fact that $f(m + p↑k) = f(m) + p↑kf↑\prime
(m) + p↑{2k}f↑{\prime\prime} (m)/2! + \cdots$ to prove that many polynomial
recurrences give the maximum period. For example, Coveyou has
proved that the period is $m = 2↑e$ if $f(0)$ is odd, $f↑\prime
(j) ≡ 1$, $f↑{\prime\prime}(j) ≡ 0$, and $f(j + 1) ≡ f(j) + 1 \modulo
{4}$ for $j = 0, 1, 2, 3$. [{\sl Studies in Applied Math.\ \bf 3} (1967),
70--111.]
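The sufficiency argument can be spot-checked by brute force. The parameters below ($d = 2$, $a = 3$, $c = 1$) are an illustrative choice of mine, not taken from the exercise; direct computation confirms that the quadratic map $x \mapsto (2x↑2 + 3x + 1) \mod 2↑e$ then runs through all $2↑e$ residues.

```python
# Sketch: the quadratic recurrence X_{n+1} = (2*X_n^2 + 3*X_n + 1) mod 2^e
# (d = 2, a = 3, c = 1 -- an illustrative choice, not from the text)
# attains the maximum period length m = 2^e.
def quad_period(e):
    m = 2 ** e
    x, steps = 0, 0
    while True:
        x = (2 * x * x + 3 * x + 1) % m
        steps += 1
        if x == 0:            # returned to the starting value: period found
            return steps
```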
\ansno 9. Let $X↓n = 4Y↓n + 2$; then the
sequence $Y↓n$ satisfies the quadratic recurrence $Y↓{n+1} =
(4Y↑{2}↓{n} + 5Y↓n + 1)\mod 2↑{e-2}$.
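The substitution is easy to confirm numerically, assuming (as the algebra $(4Y + 2)(4Y + 3) = 4(4Y↑2 + 5Y + 1) + 2$ suggests) that the exercise concerns Coveyou's recurrence $X↓{n+1} = X↓n(X↓n + 1) \mod 2↑e$ with $X↓0 ≡ 2 \modulo 4$:

```python
# Sketch: with X_n = 4*Y_n + 2, the sequence X_{n+1} = X_n*(X_n + 1) mod 2^e
# (assumed form of the exercise's recurrence) corresponds exactly to
# Y_{n+1} = (4*Y_n^2 + 5*Y_n + 1) mod 2^(e-2).
def coupled_ok(e, y0, steps):
    mx, my = 2 ** e, 2 ** (e - 2)
    x, y = 4 * y0 + 2, y0
    for _ in range(steps):
        if x != 4 * y + 2:    # the invariant X_n = 4*Y_n + 2 must persist
            return False
        x = (x * (x + 1)) % mx
        y = (4 * y * y + 5 * y + 1) % my
    return True
```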
\ansno 10. {\sl Case 1:} $X↓0 = 0$, $X↓1 = 1$; hence $X↓n
≡ F↓n$. We seek the smallest $n$ for which $F↓n ≡ 0$ and $F↓{n+1}
≡ 1\modulo {2↑e}$. Since $F↓{2n} = F↓n(F↓{n-1} + F↓{n+1})$, $F↓{2n+1}
= F↑{2}↓{n} + F↑{2}↓{n+1}$, we find by induction on $e$ that,
for $e > 1$, $F↓{3\cdot2↑{e-1}}≡0$ and $F↓{3\cdot2↑{e-1}+1}≡2↑e+1
\modulo{2↑{e+1}}$.
This implies that the period is a divisor of $3 \cdot 2↑{e-1}$
but not a divisor of $3 \cdot 2↑{e-2}$, so it is either $3 \cdot 2↑{e-1}$
or $2↑{e-1}$. But $F↓{2↑{e-1}}$ is always
odd (since only $F↓{3n}$ is even).
{\sl Case 2:} $X↓0 = a$, $X↓1 = b$. Then
$X↓n ≡ aF↓{n-1} + bF↓n$; we need to find the smallest positive
$n$ with $a(F↓{n+1} - F↓n) + bF↓n ≡ a$ and $aF↓n + bF↓{n+1} ≡ b$.
This implies that $(b↑2 - ab - a↑2)F↓n ≡ 0$, $(b↑2 - ab - a↑2)(F↓{n+1}
- 1) ≡ 0$; and $b↑2 - ab - a↑2$ is odd (i.e., prime to $m$)
so the condition is equivalent to $F↓n ≡ 0$, $F↓{n+1} ≡ 1$.
Methods to determine the period of $F↓n$
for any modulus appear in an article by D. D. Wall, {\sl AMM
\bf 67} (1960), 525--532. Further facts about the Fibonacci
sequence mod $2↑e$ have been derived by B. Jansson [{\sl Random
Number Generators} (Stockholm: Almqvist and Wiksell, 1966),
Section 3C1].
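The Case 1 result is easy to check by brute force, computing the period of $\langle F↓n \mod 2↑e\rangle$ as the smallest $n > 0$ with $F↓n ≡ 0$ and $F↓{n+1} ≡ 1$:

```python
# Sketch: period of the Fibonacci sequence mod m, compared against the
# value 3 * 2^(e-1) derived above for m = 2^e.
def fib_period(m):
    a, b, n = 0, 1, 0            # (a, b) = (F_n, F_{n+1})
    while True:
        a, b = b, (a + b) % m
        n += 1
        if (a, b) == (0, 1):     # back to (F_0, F_1): period found
            return n
```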
\def\\#1{\penalty0\;\biglp\hjust{modulo }f(z)\hjust{ and }#1\bigrp}
\ansno 11. (a) We have $z↑λ = 1 + f(z)u(z) + p↑ev(z)$
for some $u(z), v(z)$, where $v(z) \neqv 0\\{p}$.
By the binomial theorem
$$z↑{λp} = 1 + p↑{e+1}v(z) + p↑{2e+1}v(z)↑2(p - 1)/2$$
plus further terms congruent to zero $\\{p↑{e+2}}$.
Since $p↑e>2$, we have $z↑{λp}≡1+p↑{e+1}v(z)\\{p↑{e+2}}$.
If $p↑{e+1}v(z) ≡ 0 \\{p↑{e+2}}$,
there exist polynomials $a(z), b(z)$ such
that $p↑{e+1}\biglp v(z) + pa(z)\bigrp = f(z)b(z)$. Since $f(0)
= 1$, this implies that $b(z)$ is a multiple of $p↑{e+1}$ (by
Gauss's Lemma 4.6.1G); hence $v(z) ≡ 0\\{p}$,
a contradiction.
(b) If $z↑λ - 1 = f(z)u(z) + p↑ev(z)$, then
$$G(z) = u(z)/(z↑λ - 1) + p↑ev(z)/f(z)(z↑λ - 1);$$
hence $A↓{n+λ} ≡ A↓n \modulo {p↑e}$ for large
$n$. Conversely, if $\langle A↓n\rangle$ has the latter property
then $G(z) = u(z) + v(z)/(1 - z↑λ) + p↑eH(z)$, for some polynomials
$u(z)$ and $v(z)$, and some power series $H(z)$, all with integer
coefficients. This implies the identity $1 - z↑λ = u(z)f(z)(1 - z↑λ)
+ v(z)f(z) + p↑eH(z)f(z)(1 - z↑λ)$; and $H(z)f(z)(1 - z↑λ)$
is a polynomial since the other terms of the equation are polynomials.
(c) It suffices to prove that $λ(p↑e) ≠ λ(p↑{e+1})$
implies that $λ(p↑{e+1}) = pλ(p↑e) ≠ λ(p↑{e+2})$. Applying (a)
and (b), we know that $λ(p↑{e+2}) ≠ pλ(p↑e)$, and that $λ(p↑{e+1})$
is a divisor of $pλ(p↑e)$ but not of $λ(p↑e)$. Hence if $λ(p↑e)
= p↑fq$, where $q \mod p ≠ 0$, then $λ(p↑{e+1})$ must be $p↑{f+1}d$,
where $d$ is a divisor of $q$. But now $X↓{n+p↑{f+1}}
≡ X↓n \modulo {p↑e}$; hence $p↑{f+1}d$ is a multiple
of $p↑fq$, hence $d = q$.\xskip [{\sl Note:} The hypothesis $p↑e >2$
is necessary; for example, let $a↓1 = 4$, $a↓2 = -1$, $k = 2$; then
$\langle A↓n\rangle = 1$, 4, 15, 56, 209, 780, $\ldotss$; $λ(2)
= 2$, $λ(4) = 4$, $λ(8) = 4$.]
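The counterexample in the note is quickly reproduced by computing $λ(m)$ as the cycle length of the state pairs $(A↓n, A↓{n+1}) \mod m$; the recurrence $A↓{n+1} = 4A↓n - A↓{n-1}$ is invertible mod $m$ since $a↓2 = -1$, so the sequence is purely periodic.

```python
# Sketch: period lambda(m) of A_{n+1} = 4*A_n - A_{n-1} with A_0 = 1, A_1 = 4,
# taken mod m, found by following the state (A_n, A_{n+1}) around its cycle.
def recurrence_period(m):
    start = (1 % m, 4 % m)
    a, b, n = start[0], start[1], 0
    while True:
        a, b = b, (4 * b - a) % m
        n += 1
        if (a, b) == start:
            return n
```

The computed values $λ(2) = 2$, $λ(4) = 4$, $λ(8) = 4$ match the note, after which the period doubles with each extra factor of 2, as part (c) predicts.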
(d) $g(z) = X↓0 + (X↓1 - a↓1X↓0)z + \cdots$\par
\rjustline{$+ (X↓{k-1} - a↓1X↓{k-2} - a↓2X↓{k-3} - \cdots
- a↓{k-1}X↓0)z↑{k-1}$.}
(e) The derivation in (b) can be generalized
to the case $G(z) = g(z)/f(z)$; then the assumption of period
length $λ$ implies that $g(z)(1 - z↑λ) ≡ 0 \\{p↑e}$;
we treated only the special case $g(z)
= 1$ above. But both sides of this congruence can be multiplied
by Hensel's $b(z)$, and we obtain $1 - z↑λ ≡ 0 \\{p↑e}$.
{\sl Note:} A more ``elementary'' proof of the
result in (c) can be given without using generating functions,
using methods analogous to those in the answer to exercise 8:
If $A↓{λ+n} = A↓n + p↑eB↓n$, for $n = r, r + 1, \ldotss ,
r + k - 1$ and some integers $B↓n$, then this same relation
holds for {\sl all} $n ≥ r$ if we define $B↓{r+k}, B↓{r+k+1},
\ldots$ by the given recurrence relation. Since the resulting
sequence of $B$'s is some linear combination of shifts of the
sequence of $A$'s, we will have $B↓{λ+n} ≡ B↓n \modulo {p↑e}$
for all large enough values of $n$. Now $λ(p↑{e+1})$ must be
some multiple of $λ = λ(p↑e)$; for all large enough $n$ we have
$A↓{n+jλ} = A↓n + p↑e(B↓n + B↓{n+λ} + B↓{n+2λ} + \cdots
+ B↓{n+(j-1)λ}) ≡ A↓n + jp↑eB↓n \modulo {p↑{2e}}$ for
$j = 1, 2, 3, \ldots\,.$ No $k$ consecutive $B$'s are multiples
of $p$; hence $λ(p↑{e+1}) = pλ(p↑e) ≠ λ(p↑{e+2})$ follows immediately
when $e ≥ 2$. We still must prove that $λ(p↑{e+2}) ≠ pλ(p↑e)$
when $p$ is odd and $e = 1$; here we let $B↓{λ+n} = B↓n +
pC↓n$, and observe that $C↓{n+λ} ≡ C↓n \modulo {p}$ when
$n$ is large enough. Then $A↓{n+p} ≡ A↓n + p↑2\left( B↓n + {p\choose 2}
C↓n\right)\modulo {p↑3}$, and the proof is readily completed.
For the history of this problem, see Morgan Ward,
{\sl Trans.\ Amer.\ Math.\ Soc.\ \bf 35} (1933), 600--628; see
also D. W. Robinson, {\sl AMM \bf 73} (1966), 619--621.
%folio 666 galley 5 (C) Addison-Wesley 1978 *
\ansno 12. The period length mod 2 can be
at most 4; and the maximum period length mod $2↑{e+1}$ is at
most twice the maximum length mod $2↑e$, by the considerations
of the previous exercise. So the maximum conceivable period
length is $2↑{e+1}$; this is achievable, for example, in the
trivial case $a = 0$, $b = c = 1$.
\ansnos 13,14. Clearly $Z↓{n+λ} = Z↓n$, so $λ↑\prime$
is certainly a divisor of $λ$. Let the least common multiple
of $λ↑\prime$ and $λ↓1$ be $λ↑{\prime}↓{1}$, and define $λ↑{\prime}↓{2}$
similarly. $X↓n + Y↓n ≡ Z↓n ≡ Z↓{n+λ↓1↑\prime} ≡ X↓n +
Y↓{n+λ↓1↑\prime}$, so $λ↓1↑\prime$ is a multiple of $λ↓2$.
Similarly, $λ↑{\prime}↓{2}$
is a multiple of $λ↓1$. This yields the desired result. (The
result is ``best possible'' in the sense that sequences for
which $λ↑\prime = λ↓0$ can be constructed, as well as sequences
for which $λ↑\prime = λ$.)
\ansno 15. Algorithm M generates $(X↓{n+k}, Y↓n)$ in
step M1 and outputs $Z↓n = X↓{n+k-q↓n}$ in step M3, for
all sufficiently large $n$. Thus $\langle Z↓n\rangle$ has a period of length
$λ↑\prime $, where $λ↑\prime$ is the least positive integer such
that $X↓{n+k-q↓n} = X↓{n+λ↑\prime+k-q↓{n+λ↑\prime}}$
for all large $n$. Since $λ$ is a
multiple of $λ↓1$ and $λ↓2$, it follows that $λ↑\prime$ is a
divisor of $λ$. (These observations are due to Alan G. Waterman.)
We also have $n + k - q↓n ≡ n + λ↑\prime + k -
q↓{n+λ↑\prime}\modulo {λ↓1}$ for all large $n$, by the distinctness
of the $X$'s. The bound on $\langle q↓n\rangle$ implies that
$q↓{n+λ↑\prime} = q↓n + c$ for all large $n$, where $c ≡ λ↑\prime
\modulo {λ↓1}$ and $\leftv c\rightv < {1\over 2}λ↓1$. But $c$ must be
0 since $\langle q↓n\rangle$ is bounded. Hence $λ↑\prime ≡ 0
\modulo {λ↓1}$, and $q↓{n+λ↑\prime}= q↓n$ for all large $n$;
it follows that $λ↑\prime$ is a multiple of $λ↓2$ and $λ↓1$,
so $λ↑\prime = λ$.
{\sl Note:} The answer to exercise 3.2.1.2--4
implies that when $\langle Y↓n\rangle$ is a linear congruential
sequence of maximum period modulo $m = 2↑e$, the period length
$λ↓2$ will be at most $2↑{e-2}$ when $k$ is a power of 2.
\ansno 16. There are several methods of proof.
(1) Using the theory of finite fields.
In the field with $2↑k$ elements let $\xi$ satisfy $\xi ↑k =
a↓1\xi ↑{k-1} + \cdots + a↓k$. Let $f(b↓1\xi ↑{k-1} + \cdots
+ b↓k) = b↓k$, where each $b↓j$ is either zero or one; this
is a linear function. If word {\tt X} in the generation algorithm
is $(b↓1b↓2 \ldotsm b↓k)↓2$ before (10) is executed, and if $b↓1\xi
↑{k-1} + \cdots + b↓k\xi ↑0 = \xi ↑n$, then word {\tt X} represents
$\xi ↑{n+1}$ after (10) is executed. Hence the sequence is $f(\xi
↑n)$, $f(\xi ↑{n+1})$, $f(\xi ↑{n+2})$, $\ldotss$; and $f(\xi ↑{n+k})
= f(\xi ↑n\xi ↑k) = f(a↓1\xi ↑{n+k-1} + \cdots + a↓k\xi ↑n)
= a↓1f(\xi ↑{n+k-1}) + \cdots + a↓kf(\xi ↑n)$.
(2) Using brute force, or elementary ingenuity.
We are given a sequence $X↓{nj}$, $n ≥ 0$, $1 ≤ j ≤ k$, satisfying
$$X↓{(n+1)j} ≡ X↓{n(j+1)} + a↓jX↓{n1},\qquad 1 ≤ j < k;\qquad
X↓{(n+1)k} ≡ a↓kX↓{n1}\quad \modulo 2.$$
We must show that this implies $X↓{nk} ≡ a↓1X↓{(n-1)k}
+ \cdots + a↓kX↓{(n-k)k}$, for $n ≥ k$. Indeed, it implies
$X↓{nj} ≡ a↓1X↓{(n-1)j} + \cdots + a↓kX↓{(n-k)j}$ when $1 ≤
j ≤ k ≤ n$. This is clear for $j = 1$, since $X↓{n1} ≡ a↓1X↓{(n-1)1}
+ X↓{(n-1)2} ≡ a↓1X↓{(n-1)1} + a↓2X↓{(n-2)2} + X↓{(n-2)3}$,
etc. For $j > 1$, we have by induction
$$\eqalign{X↓{nj} ⊗≡ X↓{(n+1)(j-1)} - a↓{j-1}X↓{n1}\cr
\noalign{\vskip 6pt}
⊗≡ \sum ↓{1
≤i≤k} a↓iX↓{(n+1-i)(j-1)} - a↓{j-1} \sum ↓{1≤i≤k} a↓iX↓{(n-i)1}\cr
⊗≡ \sum ↓{1≤i≤k} a↓i(X↓{(n+1-i)(j-1)} - a↓{j-1}X↓{(n-i)1})\cr
⊗≡ a↓1X↓{(n-1)j} + \cdots + a↓kX↓{(n-k)j}.\cr}$$
This proof does {\sl not} depend on the fact that operations
were done modulo 2, or modulo any prime number.
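A concrete instance of proof (1), with the illustrative choice $k = 4$ and $\xi↑4 = \xi + 1$ (the primitive polynomial $x↑4 + x + 1$, my example rather than one from the text): the low-order bits of the successive words representing $\xi↑n$ obey $s↓{n+4} ≡ s↓{n+1} + s↓n \modulo 2$.

```python
# Sketch: words represent elements b1*xi^3 + b2*xi^2 + b3*xi + b4 of GF(16),
# with b1 the high-order bit. One step multiplies the word by xi, reducing
# by xi^4 = xi + 1; the function f of the proof extracts b4 (low-order bit).
POLY = 0b10011               # x^4 + x + 1 (assumed primitive polynomial)

def times_xi(word):
    word <<= 1
    if word & 0b10000:       # overflow past xi^3: subtract (XOR) the polynomial
        word ^= POLY
    return word

def low_bit_sequence(length):
    word, bits = 1, []       # word = xi^0
    for _ in range(length):
        bits.append(word & 1)
        word = times_xi(word)
    return bits
```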
\ansno 17. (a) When the sequence terminates, the $(k
- 1)$-tuple $(X↓{n+1}, \ldotss, X↓{n+k-1})$ occurs for the $(m
+ 1)$st time. A given $(k - 1)$-tuple $(X↓{r+1}, \ldotss, X↓{r+k-1})$
can have only $m$ distinct predecessors $X↓r$, so one of these
occurrences must be for $r = 0$. (b) Since the $(k - 1)$-tuple
$(0, \ldotss , 0)$ occurs $(m + 1)$ times, each possible predecessor
appears, so the $k$-tuple $(a↓1, 0, \ldotss , 0)$ appears for
all $a↓1, 0 ≤ a↓1 < m$. Let $1 ≤ s < k$ and suppose we have
proved that all $k$-tuples $(a↓1, \ldotss , a↓s, 0, \ldotss ,
0)$ appear in the sequence when $a↓s ≠ 0$. By the construction,
this $k$-tuple would not be in the sequence unless $(a↓1, \ldotss
, a↓s, 0, \ldotss , 0, y)$ had appeared earlier for $1 ≤ y <
m$. Hence the $(k - 1)$-tuple $(a↓1, \ldotss , a↓s, 0, \ldotss
, 0)$ has appeared $m$ times, and all $m$ possible predecessors
appear; this means $(a, a↓1, \ldotss , a↓s, 0, \ldotss , 0)$ appears
for $0 ≤ a < m$. The proof is now complete by induction.
The result also follows from Theorem 2.3.4.2D,
using the directed graph of exercise 2.3.4.2--23; the set of
arcs from $(x↓1, \ldotss , x↓j, 0, \ldotss , 0)$ to $(x↓2, \ldotss
, x↓j, 0, \ldotss , 0, 0)$, where $x↓j ≠ 0$ and $1 ≤ j ≤ k$,
forms an oriented subtree related neatly to Dewey decimal notation.
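The construction can be tried out directly. The sketch below assumes the greedy rule: begin with $k$ zeros and repeatedly append the largest digit that does not recreate an earlier $k$-tuple, stopping when no digit works (M. H. Martin's construction; the assumption is mine, since the exercise statement is not reproduced here).

```python
# Sketch of the greedy construction (assumed rule: start with k zeros,
# always append the largest digit that yields a new k-tuple).
def greedy_sequence(m, k):
    seq = [0] * k
    seen = {tuple(seq)}
    while True:
        for d in range(m - 1, -1, -1):
            t = tuple(seq[-(k - 1):]) + (d,) if k > 1 else (d,)
            if t not in seen:
                seen.add(t)
                seq.append(d)
                break
        else:
            return seq           # no digit works: the sequence terminates
```

Every $k$-tuple appears exactly once, so the total length is $m↑k + k - 1$.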
\ansno 18. The third-most-significant bit of $U↓{n+1}$
is completely determined by the first and third bits of $U↓n$,
so only 32 of the 64 possible pairs $(\lfloor 8U↓n\rfloor ,
\lfloor 8U↓{n+1}\rfloor )$ occur. ({\sl Notes:} If we had
used, say, 11-bit numbers $U↓n = (.X↓{11n}X↓{11n+1} \ldotsm X↓{11n+10})↓2$,
the sequence {\sl would} be satisfactory for many applications.
If another constant appears in {\tt A} having more ``one'' bits,
the generalized spectral test might give some indication of
its suitability. See exercise 3.3.4--2; we could examine $\nu
↓t$ in dimensions $t = 36, 37, 38, \ldotss\,$.)
\ansno 21. [{\sl J. London Math. Soc. \bf 21} (1946),
169--172.] Any sequence of period length $m↑k - 1$ with no $k$
consecutive zeroes leads to a sequence of period length $m↑k$
by inserting a zero in the appropriate place, as in exercise
7; conversely, we can start with a sequence of period length
$m↑k$ and delete an appropriate zero from the period, to form
a sequence of the other type. Let us call these ``$(m, k)$ sequences''
of types A and B. The hypothesis assures us of the existence
of $(p, k)$ sequences of type A, for all primes $p$ and all
$k ≥ 1$; hence we have $(p, k)$ sequences of type B for all
such $p$ and $k$.
To get a $(p↑e, k)$ sequence of type B, let
$e = qr$, where $q$ is a power of $p$ and $r$ is not a multiple
of $p$. Start with a $(p, qrk)$ sequence of type A, namely
$X↓0, X↓1, X↓2, \ldotss $; then (using the $p$-ary number system)
the grouped digits $(X↓0 \ldotsm X↓{q-1})↓p, (X↓q \ldotsm X↓{2q-1})↓p,
\ldots$ form a $(p↑q, rk)$ sequence of
type A, since $q$ is relatively prime to $p↑{qrk} - 1$ and
the sequence therefore has a period length of $p↑{qrk} - 1$.
This leads to a $(p↑q, rk)$ sequence $\langle Y↓n\rangle$ of
type B; and $(Y↓0Y↓1 \ldotsm Y↓{r-1})↓{p↑q}$, $(Y↓rY↓{r+1}
\ldotsm Y↓{2r-1})↓{p↑q}$, $\ldots$ is a $(p↑{qr}, k)$ sequence
of type B by a similar argument, since $r$ is relatively prime
to $p↑{qk}$.
To get an $(m, k)$ sequence of type B for arbitrary
$m$, we can combine $(p↑e, k)$ sequences for each of the prime
power factors of $m$ using the Chinese remainder theorem; but
a simpler method is available. Let $\langle X↓n\rangle$ be an
$(r, k)$ sequence of type B, and let $\langle Y↓n\rangle$
be an $(s, k)$ sequence of type B, where $r$ and $s$ are relatively
prime; then $\langle sX↓n + Y↓n\rangle$ is an $(rs, k)$ sequence
of type B.
A simple, uniform construction which yields $(2,
k)$ sequences for arbitrary $k$ has been discovered by A. Lempel
[{\sl IEEE Trans.\ \bf C--19} (1970), 1204--1209].
\ansno 22. By the Chinese remainder theorem, we can find constants
$a↓1, \ldotss , a↓k$ having desired residues mod each prime
divisor of $m$. If $m = p↓1p↓2 \ldotsm p↓t$, the period
length will be lcm$(p↑{k}↓{1} - 1, \ldotss , p↑{k}↓{t}
- 1)$. In fact, we can achieve reasonably long periods for arbitrary
$m$ (not necessarily squarefree), as shown in exercise 11.
\ansno 23. Period length at least $2↑{55} - 1$; possibly
faster than (7), see exercise 3.2.1.1--5. Furthermore, R. Brent
has pointed out that the calculations can be done exactly on
floating-point numbers in $[\,0, 1)$.
\ansno 24. Run the sequence backwards; i.e., if $Z↓n
= Y↓{-n}$ we have $Z↓n = (Z↓{n-m+k} + Z↓{n-m})\mod 2$.
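For instance, with lags $(24, 55)$ and working mod 2: reversing any stretch of $Y↓n = (Y↓{n-24} + Y↓{n-55}) \mod 2$ yields a sequence satisfying $Z↓n = (Z↓{n-31} + Z↓{n-55}) \mod 2$, since $55 - 24 = 31$.

```python
# Sketch: the time-reversal of a lagged-Fibonacci sequence mod 2 satisfies
# the complementary recurrence with lags m-k and m (here k = 24, m = 55).
K, M = 24, 55

def forward(seed_bits, length):
    y = list(seed_bits)                  # needs at least M seed bits
    while len(y) < length:
        y.append((y[-K] + y[-M]) % 2)
    return y

def reversed_recurrence_holds(ys):
    zs = ys[::-1]
    return all(zs[n] == (zs[n - (M - K)] + zs[n - M]) % 2
               for n in range(M, len(zs)))
```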
\ansno 25. This actually would be slower and more complicated, unless it
can be used to save subroutine-calling overhead in high-level languages.
(See the {\tt FORTRAN} program in Section 3.6.)
\ansno 27. Let $J↓n=\lfloor k X↓n/m\rfloor$.\xskip {\sl Lemma:} After the
$(k↑2+7k-2)/2$ consecutive values
$$0↑{k+2}\ 1\ 0↑{k+1}\ 2\ 0↑k\ \ldots\ (k-1)\ 0↑3$$
occur in the $\langle J↓n \rangle$ sequence, Algorithm B will have $V[j]
< m/k$ for $0≤j<k$, and also $Y<m/k$.\xskip {\sl Proof.} Let $S↓n$ be the
set of positions $i$ such that $V[i]<m/k$ just before $X↓n$ is generated,
and let $j↓n$ be the index such that $V[j↓n]←X↓n$. If $j↓n \notin S↓n$ and
$J↓n=0$, then $S↓{n+1}=S↓n ∪ \{j↓n\}$; if $j↓n \in S↓n$ and $J↓n=0$, then
$S↓{n+1}=S↓n$ and $j↓{n+1}=0$. After $k+2$ successive 0's, we must therefore
have $0\in S↓n$ and $j↓{n+1}=0$. Then after ``$1\ 0↑{k+1}$'' we must have
$\{0,1\} \subset S↓n$ and $j↓{n+1}=0$; after ``$2\ 0↑k$'' we must have
$\{0,1,2\} \subset S↓n$ and $j↓{n+1}=0$; and so on.
{\sl Corollary:} For $λ≥2(k↑2+7k-2) k↑{(k↑2+7k-2)/2}$, either Algorithm B
yields a period of length $λ$ or the sequence $\langle X↓n\rangle$ is poorly
distributed.\xskip {\sl Proof.} The probability that
any given length-$l$ pattern of $J$'s does not occur in a random sequence of
length $λ$ is less than $(1-k↑{-l})↑{λ/l}<\exp(-k↑{-l}λ/l)$. For $l=(k↑2+7k-2)/2$
this is at most $e↑{-4}$; hence the stated pattern should appear. After it
does, the subsequent behavior of Algorithm B will be the same each time it
reaches this part of the period.
\ansbegin{3.3.1}
\ansno 1. There are $k = 11$
categories, so the line $\nu = 10$ should be used.
\ansno 2. ${2\over 49}$, ${3\over 49}$, ${4\over 49}$,
${5\over 49}$, ${6\over 49}$, ${9\over 49}$, ${6\over 49}$,
${5\over 49}$, ${4\over 49}$, ${3\over 49}$, ${2\over 49}$.
\ansno 3. $V = 7{173\over 240}$, only very slightly higher
than that obtained from the good dice! There are two reasons
why we do not detect the weighting: (a) The new probabilities
(cf.\ exercise 2) are not really very far from the old ones in
Eq. (1). The sum of the two dice tends to smooth out the probabilities;
if we considered instead each of the 36 possible pairs of values,
and counted these, we would probably detect the difference quite
rapidly (assuming that the two dice are distinguishable).\xskip (b) A far
more important reason is that $n$ is too small for a significant
difference to be detected. If the same experiment is done for
large enough $n$, the faulty dice will be discovered (see exercise
12).
\ansno 4. $p↓s = {1\over 12}$ for $2 ≤ s ≤ 12$ and $s
≠ 7$; $p↓7 = {1\over 6}$. The value of $V$ is $16{1\over 2}$,
which falls between the $75\%$ and $95\%$ entries in Table 1;
so it is reasonable, in spite of the fact that not too
many sevens actually turned up.
\ansno 5. $K↑{+}↓{20} = 1.15; K↑{-}↓{20} = 0.215$; these
do not differ significantly from random behavior (being at about
the $94\%$ and $86\%$ levels), but they are mighty close.
(The data values in this exercise come from Appendix A, Table
1.)
\ansno 6. The probability that $X↓j ≤ x$ is $F(x)$, so
we have the binomial distribution
discussed in Section 1.2.10. $F↓n(x) = s/n$ with probability
${n\choose s}F(x)↑s\biglp 1 - F(x)\bigrp ↑{n-s}$;
the mean is $F(x)$; the standard deviation is $\sqrt{F(x)\biglp
1 - F(x)\bigrp /n}$. (Cf.\ Eq. 1.2.10--19. This suggests that
a slightly better statistic would be to define
$$K↑{+}↓{n} = \sqrt n\max↓{-∞<x<∞}\biglp F↓n(x)
- F(x)\bigrp /\sqrt{F(x)\biglp 1 - F(x)\bigrp},$$
etc.) {\sl Notes:} Similarly, we can calculate
the mean and standard deviation of $F↓n(x) - F↓n(y)$, for $x
< y$, and obtain the covariance of $F↓n(x), F↓n(y)$. Using these
facts, it can be shown that for large values of $n$ the function
$F↓n(x)$ behaves as a ``Brownian motion,'' and techniques from
this branch of probability theory may be used to study it. The
situation is exploited in articles by J. L. Doob and M. D. Donsker,
{\sl Annals Math.\ Stat.\ \bf 20} (1949), 393--403
and {\bf 23} (1952), 277--281; this is generally regarded as the most
enlightening way to study the KS tests.
%folio 670 galley 6 (C) Addison-Wesley 1978 *
\ansno 7. $\biglp$Cf.\ Eq.\ (13).$\bigrp$ Take
$j = n$ to see that $K↑{+}↓{n}$ is never negative and it can
get as high as $\sqrt{n}$. Similarly, take $j = 1$ to make the
same observations about $K↑{-}↓{n}$.
\ansno 8. The new KS statistic was computed for 20 observations.
The distribution of $K↑{+}↓{10}$ was used as $F(x)$ when the
KS statistic was computed.
\ansno 9. The idea is erroneous, because all of the
observations must be {\sl independent.} There is a relation
between the statistics $K↑{+}↓{n}$ and $K↑{-}↓{n}$ on the same
data, so each test should be judged separately. (A high value
of one tends to give a low value of the other.) Similarly, the
entries in Figs.\ 2 and 5 (which show 15 tests for each generator)
do not show 15 independent observations, because the ``maximum
of 5'' test is not independent of the ``maximum of 4'' test.
The three tests of each horizontal row are independent (because
they were done on different parts of the sequence), but the
five tests in a column are somewhat correlated. The net effect
of this is that the 95-percent probability levels, etc., which
apply to one test, cannot legitimately be applied to a whole
group of tests on the same data. Moral: When testing a random-number
generator, we may expect it to ``pass'' each of several tests,
e.g., the frequency test, maximum test, run test, etc.; but
an array of data from several different tests should not be
considered as a unit since the tests themselves may not be independent.
The $K↑{+}↓{n}$ and $K↑{-}↓{n}$ statistics should be considered
as two separate tests; a good source of random numbers will
pass both tests.
\ansno 10. Each $Y↓s$ is doubled, and $np↓s$ is doubled,
so the numerators of (6) are quadrupled while the denominators
only double. Hence the new value of $V$ is twice as high as
the old one.
\ansno 11. The empirical distribution function stays
the same; the values of $K↑{+}↓{n}$ and $K↑{-}↓{n}$ are multiplied
by $\sqrt{2}$.
\ansno 12. Let $Z↓s = (Y↓s - nq↓s)/\sqrt{nq↓s}$. The
value of $V$ is $n$ times
$$\sum ↓{1≤s≤k}(q↓s - p↓s + \sqrt{q↓s/n}Z↓s)↑2/p↓s,$$
and the latter quantity stays bounded away from
zero as $n$ increases (since $Z↓sn↑{-1/4}$ is bounded
with probability 1). Hence the value of $V$ will increase to
a value which is extremely improbable under the $p↓s$ assumption.
For the KS test, let $F(x)$ be the assumed distribution,
$G(x)$ the actual distribution, and let $h = \max \left|G(x) - F(x)\right|$.
Take $n$ large enough so that $\left|F↓n(x)
- G(x)\right| > h/2$ occurs with very small probability; then
$\left|F↓n(x) - F(x)\right|$ will be
improbably high under the assumed distribution $F(x)$.
\ansno 13. (The ``max'' notation should really be replaced
by ``sup'' since a least upper bound is meant; however, ``max''
was used in the text to avoid confusing too many readers by
the less familiar ``sup'' notation.) For convenience, let $X↓0
= -∞$, $X↓{n+1} = +∞$. When $X↓j ≤ x < X↓{j+1}$, we have $F↓n(x)
= j/n$, and in this interval $\max\biglp F↓n(x) - F(x)\bigrp
= j/n - F(X↓j)$; $\max\biglp F(x) - F↓n(x)\bigrp = F(X↓{j+1})
- j/n$. For $0 ≤ j ≤ n$, all real values of $x$ are considered;
this proves that
$$\eqalign{
K↑{+}↓{n}⊗=\sqrt n \max↓{0≤j≤n}\left({j\over n} - F(X↓j)\right);\cr
K↑{-}↓{n}⊗=\sqrt n \max↓{1≤j≤n+1}\left(F(X↓j) - {j - 1\over n}\right).\cr}$$
These are equivalent to (13), since the extra term under
the maximum signs is nonpositive and it must be redundant by
exercise 7.
\ansno 14. The logarithm of the left-hand side simplifies
to
$$\halign{\hjust to size{#}\cr
$\dispstyle-\sum ↓{1≤s≤k}Y↓s \ln\left(1 + {Z↓s\over
\sqrt{np↓s}}\right) + {1 - k\over 2}\ln(2πn)$\hfill\cr
\hfill$\dispstyle - {1\over 2}\sum ↓{1≤s≤k}\ln p↓s - {1\over 2}
\sum ↓{1≤s≤k}\ln\left(1 + {Z↓s\over \sqrt{np↓s}}\right)+O\left(1\over n\right)$\cr
}$$
and this quantity simplifies further (upon expanding $\ln(1
+Z↓s/\sqrt{np↓s})$ and realizing that $\sum ↓{1≤s≤k}
Z↓s\sqrt{np↓s} = 0$) to
$$-{1\over 2} \sum ↓{1≤s≤k} Z↑{2}↓{s} + {1 - k\over 2}
\ln(2πn) - {1\over 2} \ln(p↓1 \ldotsm p↓k) + O\left(1\over
\sqrt{n}\right).$$
\ansno 15. The corresponding Jacobian determinant
is easily evaluated by (a) removing the factor $r↑{n-1}$ from
the determinant, (b) expanding the resulting determinant by
the cofactors of the row containing ``$\cos \theta ↓1$ $- \sin
\theta ↓1$ $0 \ldotsm 0$'' (each of the cofactor determinants
may be evaluated by induction), and (c) recalling that $\sin↑2
\theta ↓1 + \cos↑2 \theta ↓1 = 1$.
\ansno 16. $\dispstyle \int ↑{z\sqrt{2x}+y}↓{0}\hskip-3pt\exp\left(- {u↑2\over 2x}
+\cdots\right)\,du = ye↑{-z↑2} + O\left(1\over
\sqrt{x}\right) + \int ↑{z\sqrt{2x}}↓{0}\hskip-3pt\exp\left(-{u↑2\over
2x}+\cdots\right)\,du.$
\vskip 11pt plus 3pt minus 8pt
\noindent The latter integral is
$$\int ↑{z\sqrt{2x}}↓{0}e↑{-u↑2/2x}\,du + {1\over 3x↑2} \int
↑{z\sqrt{2x}}↓{0}e↑{-u↑2/2x}u↑3\,du + O\left(1\over
\sqrt{x}\right).$$
When all is put together, the final result is
$${\gamma (x + 1, x + z\sqrt{2x} + y)\over\Gamma
(x + 1)} = {1\over \sqrt{2π}} \int ↑{z\sqrt2}↓{-∞}e↑{-u↑2/2}
\,du + {e↑{-z↑2}\over\sqrt{2πx}}{\textstyle(y
- {2\over 3} - {2\over 3}z↑2)} + O\left(1\over \sqrt x\right).$$
If we set $z\sqrt{2} = x↓p$ and write
$${1\over \sqrt{2π}} \int ↑{z\sqrt2}↓{-∞}e↑{-u↑2/2}
\,du = p,\qquad x + 1 = {\nu \over 2} ,\qquad \gamma \left({\nu
\over 2} , {t\over 2}\right)\left/\Gamma \left(\nu \over 2\right)\right.
= p,$$
where $t/2 = x + z\sqrt{2x} + y$, we can solve
for $y$ to obtain $y = {2\over 3}(1 + z↑2) + O(1/\sqrt{x}\,)$,
which is consistent with the above analysis. The solution is
therefore $t = \nu + 2\sqrt{\nu }z + {4\over 3}z↑2 - {2\over
3} + O(1/\sqrt{\nu }\,)$.
\ansno 17. (a) Change of variable, $x↓j ← x↓j + t$.\xskip (b)
Induction on $n$; by definition,
$$P↓{n0}(x - t) = \int ↑{x}↓{n}P↓{(n-1)0}(x↓n - t)\,dx↓n.$$
(c) The left-hand side is
$$\int ↑{x+t}↓{n}\,dx↓n\; \ldots \int ↑{x↓{k+2}}↓{k+1}\,dx↓{k+1}\qquad
\hjust{times}\qquad \int ↑{k}↓{t}\,dx↓k \int ↑{x↓k}↓{t}\,dx↓{k-1}\; \ldots
\int ↑{x↓2}↓{t}\,dx↓1.$$
(d) From (b), (c) we have
$$P↓{nk}(x) = \sum ↓{0≤r≤k} {(r - t)↑r\over r!} {(x + t - r)↑{n-r-1}\over
(n - r)!} (x + t - n).$$
The numerator in (24) is $P↓{n\lfloor t\rfloor}(n)$.
\ansno 18. We may assume that $F(x) = x$ for $0 ≤ x ≤
1$, as remarked in the text's derivation of (24). If $0 ≤ X↓1
≤ \cdots ≤ X↓n ≤ 1$, let $Z↓j = 1 - X↓{n+1-j}$. We have $0
≤ Z↓1 ≤ \cdots ≤ Z↓n ≤ 1$; and $K↑{+}↓{n}$ evaluated for $X↓1,
\ldots , X↓n$ equals $K↑{-}↓{n}$ evaluated for $Z↓1, \ldots
, Z↓n$. This symmetrical relation gives a one-to-one correspondence
between sets of equal volume for which $K↑{+}↓{n}$ and $K↑{-}↓{n}$
fall in a given range.
\ansno 23. Let $m$ be any number $≥n$.\xskip (a)
If $\lfloor mF(X↓i)\rfloor = \lfloor mF(X↓j)\rfloor$ and $i
> j$ then $i/n - F(X↓i) > j/n - F(X↓j)$.\xskip (b) Start with $a↓k
= 1.0$, $b↓k = 0.0$, $c↓k = 0$ for $0 ≤ k < m$. Then do the following for
each observation $X↓j$: Set $Y ← F(X↓j)$, $k ← \lfloor mY\rfloor$,
$a↓k ← \min(a↓k, Y)$, $b↓k ← \max(b↓k, Y)$, $c↓k ← c↓k + 1$. [Assume
that $F(X↓j) < 1$ so that $k < m$.] Then set $j ← 0$, $r↑+ ← r↑-
← 0$, and for $k = 0$, 1, $\ldotss$, $m - 1$ (in this order) do
the following whenever $c↓k > 0$: Set $r↑- ← \max(r↑-, a↓k
- j/n)$, $j ← j + c↓k$, $r↑+ ← \max(r↑+, j/n - b↓k)$. Finally set
$K↑{+}↓{n} ← \sqrt{n}r↑+$, $K↑{-}↓{n} ← \sqrt{n}r↑-$. The time
required is $O(m + n)$, and the precise value of $n$ need not
be known in advance. (If the estimate $(k + {1\over 2})/m$ is
used for $a↓k$ and $b↓k$, so that only the values $c↓k$ are actually
computed for each $k$, we obtain estimates of $K↑{+}↓{n}$ and
$K↑{-}↓{n}$ good to within ${1\over2}\sqrt{n}/m$, even when $m < n$.)
[{\sl ACM Trans.\ Math.\ Software \bf 3} (1977), 60--64.]
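The single-pass scheme is straightforward to implement. The sketch below (mine, following the description above) checks it against direct evaluation of Eqs.\ (13); exactness requires $m ≥ n$, as part (a) shows.

```python
import math
import random

def ks_binned(xs, F, m):
    # The answer's O(m + n) method: for each bin k = floor(m*F(X)), keep the
    # minimum a_k, maximum b_k, and count c_k, then sweep the bins in order.
    a, b, c = [1.0] * m, [0.0] * m, [0] * m
    for x in xs:
        y = F(x)
        k = int(m * y)                 # assumes F(x) < 1, so k < m
        a[k] = min(a[k], y)
        b[k] = max(b[k], y)
        c[k] += 1
    n, j, rp, rm = len(xs), 0, 0.0, 0.0
    for k in range(m):
        if c[k] > 0:
            rm = max(rm, a[k] - j / n)
            j += c[k]
            rp = max(rp, j / n - b[k])
    return math.sqrt(n) * rp, math.sqrt(n) * rm

def ks_direct(xs, F):
    # Direct evaluation of Eqs. (13) from the sorted observations.
    n = len(xs)
    ys = sorted(F(x) for x in xs)
    kp = max((j + 1) / n - y for j, y in enumerate(ys))
    km = max(y - j / n for j, y in enumerate(ys))
    return math.sqrt(n) * kp, math.sqrt(n) * km
```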
%folio 674 galley 7 (C) Addison-Wesley 1978 *
\ansbegin{3.3.2}
\ansno 1. The observations for
a chi-square test must be independent, and in the second sequence
successive observations are manifestly dependent, since the
second component of one equals the first component of the next.
\ansno 2. Form $t$-tuples $(Y↓{jt}, \ldotss, Y↓{jt+t-1})$, for
$0 ≤ j < n$, and count how many of these equal each possible
value. Apply the chi-square test with $k = d↑t$ and with probability
$1/d↑t$ in each category. The number of observations, $n$, should
be at least $5d↑t$.
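The counting just described might be coded as follows (a hypothetical helper, assuming the $Y$'s are integers in the range $0 ≤ Y < d$ and non-overlapping $t$-tuples):

```python
from collections import Counter
from itertools import product

def serial_test_v(seq, d, t):
    # Non-overlapping t-tuples; k = d^t categories, probability 1/d^t each.
    n = len(seq) // t
    counts = Counter(tuple(seq[j*t:(j+1)*t]) for j in range(n))
    expected = n / d**t
    return sum((counts.get(tup, 0) - expected)**2 / expected
               for tup in product(range(d), repeat=t))
```

The statistic is zero exactly when every possible $t$-tuple occurs equally often.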
\ansno 3. The probability that $j$ values are examined,
i.e., the probability that $U↓{j-1}$ is the $n$th element of
the sequence lying in the range $α ≤U↓{j-1} < β$, is easily
seen to be
$${j-1\choose n-1}p↑n(1 - p)↑{j-n},$$
by enumeration of the possible places in which
the other $n - 1$ occurrences can appear and by evaluating the
probability of such a pattern. The generating function is $G(z)
= \biglp pz/(1 - (1 - p)z)\bigrp ↑n$, which makes sense since
the given distribution is the $n$-fold convolution of the same
thing for $n = 1$. Hence the mean and variance are proportional
to $n$; the number of $U$'s to be examined is now easily found
to have the characteristics $\biglp$min\penalty200\ $n$, ave $n/p$, max $∞$,
dev $\sqrt{n(1 - p)}/p\bigrp$. A more detailed discussion of
this probability distribution when $n = 1$ may be found in the
answer to exercise 3.4.1--17; see also the considerably more
general results of exercise 2.3.4.2--26.
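A numerical check of this negative-binomial distribution, with illustrative values $n=2$ and $p={1\over2}$ (the tail beyond $j=400$ is negligible):

```python
from math import comb

def pmf(j, n, p):
    # probability that exactly j values are examined
    return comb(j - 1, n - 1) * p**n * (1 - p)**(j - n)

n, p = 2, 0.5
probs = [pmf(j, n, p) for j in range(n, 400)]
mean = sum(j * pmf(j, n, p) for j in range(n, 400))
var = sum(j * j * pmf(j, n, p) for j in range(n, 400)) - mean**2
```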
\ansno 4. The probability of a gap of length $≥r$ is
the probability that $r$ consecutive $U$'s lie outside the given
range, i.e., $(1 - p)↑r$. The probability of a gap of length
exactly $r$ is the above value for length $≥r$ minus the value
for length $≥(r + 1)$.
\ansno 5. As $N$ goes to infinity, so does $n$ (with
probability 1), hence this test is just the same as the gap
test described in the text except for the length of the very
last gap. And the text's gap test certainly is asymptotic to
the chi-square distribution stated, since the length of each
gap is independent of the length of the others. [{\sl
Notes:} A quite complicated proof of this result by E. Bofinger
and V. J. Bofinger appears in {\sl Annals Math.\ Stat.\
\bf 32} (1961), 524--534. Their paper is noteworthy because
it discusses several interesting variations of the gap test;
they show, for example, that the quantity
$$\sum ↓{0≤r≤t} {\biglp Y↓r - (Np)p↓r\bigrp ↑2\over (Np)p↓r}$$
does {\sl not} approach a chi-square distribution,
although others had suggested this statistic as a ``stronger''
test because $Np$ is the expected value of $n$.]
\ansno 7. 5, 3, 5, 6, 5, 5, 4.
\ansno 8. See exercise 10, with $w = d$.
\ansno 9. (Change $d$ to $w$ in steps C1 and C4.) We
have
$$\eqalign{p↓r⊗ = {d(d - 1)\ldotsm(d - w + 1)\over d↑r}{r-1\comb\{\}
w - 1}\qquad \hjust{for }w ≤ r < t;\cr
\noalign{\vskip 3pt}
p↓t⊗= 1 - {d!\over d↑{t-1}}\left({1\over 0!}{t-1\comb\{\}
d} + \cdots + {1\over (d - w)!} {t-1\comb\{\}w}\right).\cr}$$
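These probabilities can be checked exactly in rational arithmetic (a sketch; the parameters $d=4$, $w=3$, $t=8$ are arbitrary illustrations):

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(None)
def stirling2(n, k):
    # Stirling subset numbers, by the usual recurrence
    if n == k: return 1
    if n == 0 or k == 0: return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def coupon_probs(d, w, t):
    falling = factorial(d) // factorial(d - w)   # d(d-1)...(d-w+1)
    p = {r: Fraction(falling, d**r) * stirling2(r - 1, w - 1)
         for r in range(w, t)}
    p[t] = 1 - Fraction(factorial(d), d**(t - 1)) * sum(
        Fraction(stirling2(t - 1, j), factorial(d - j)) for j in range(w, d + 1))
    return p
```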
\ansno 10. As in exercise 3, we really need consider
only the case $n = 1$. The generating function for the probability
that a coupon set has length $r$ is
$$G(z) = {d!\over (d - w)!} \sum ↓{r>0}{r-1\comb\{\} w-1}\left( z\over d\right)↑r
= z↑w\left(d-1\over d-z\right)\ldotsm\left(d-w+1\over d-(w-1)z\right)$$
by the previous exercise and Eq. 1.2.9--28. The mean and variance are readily
computed using Theorem 1.2.10A and exercise 3.4.1--17. We find
that
$$\eqalign{\hjust{mean}(G)⊗= w + \left({d\over d - 1} - 1\right)+\cdots+
\left({d\over d-w+1}-1\right)=d(H↓d-H↓{d-w})=\mu;\cr
\hjust{var}(G) ⊗= d↑2(H↑{(2)}↓{d} - H↑{(2)}↓{d-w})
- d(H↓d-H↓{d-w}) = \sigma ↑2.\cr}$$
The number of $U$'s examined, as the search for a coupon
set is repeated $n$ times, therefore has the characteristics
$\biglp$min $wn$, ave $\mu n$, max $∞$, dev $\sigma \sqrt{n}\bigrp$.
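The mean can be checked against a truncated sum of the probabilities from the preceding exercise (illustrative values $d=4$, $w=4$, for which the mean is $4H↓4={25\over3}$):

```python
from math import factorial
from functools import lru_cache

@lru_cache(None)
def s2(n, k):                      # Stirling subset numbers
    if n == k: return 1
    if n == 0 or k == 0: return 0
    return k * s2(n - 1, k) + s2(n - 1, k - 1)

d, w = 4, 4
falling = factorial(d) // factorial(d - w)
# truncated mean of the coupon-set length distribution
mean_pmf = sum(r * falling * s2(r - 1, w - 1) / d**r for r in range(w, 300))
H = lambda n: sum(1 / i for i in range(1, n + 1))
mean_formula = d * (H(d) - H(d - w))
```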
\def\0{\hjust to 7pt{}}
\def\\{\hjust to 7pt{\hfill\vrule height 9.944pt depth 3pt\hfill}}
\ansno 11. \\1\\2\\9\08\05\03\\6\\7\00\\4\\.
\ansno 12. {\bf Algorithm R} ({\sl Data for run test\/}){\bf.}
\algstep R1. [Initialize.] Set $j
← -1$, and set {\tt$\hjust{COUNT}[1] ← \hjust{COUNT}[2] ← \cdots ←
\hjust{COUNT}[6] ← 0$}. Also set $U↓n ← U↓{n-1}$, for convenience in terminating the
algorithm.
\algstep R2. [Set $r$ zero.] Set $r ← 0$.
\algstep R3. [Is $U↓j < U↓{j+1}?]$
Increase $r$ and $j$ by 1. If $U↓j < U↓{j+1}$, repeat this step.
\algstep R4. [Record the length.] If $r ≥
6$, increase {\tt COUNT}[6] by one, otherwise increase {\tt COUNT}$[r]$
by one.
\algstep R5. [Done?] If $j < n - 1$, return
to step R2.\quad\blackslug
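Algorithm R amounts to the following (a sketch; as in step R4, run lengths of 6 or more are lumped together):

```python
def run_test_counts(U):
    # count[r] = number of ascending runs of length r (r = 1..5);
    # count[6] = number of runs of length 6 or more
    count = [0] * 7
    n = len(U); j = 0
    while j < n:
        r = 1
        while j + 1 < n and U[j] < U[j + 1]:
            j += 1; r += 1
        count[min(r, 6)] += 1
        j += 1
    return count[1:]
```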
\def\\{\mathrel{\vcenter{\hjust{>}\vskip-4pt\hjust{<}}}}
\ansno 13. There are $(p +
q + 1){p+q\choose p}$ ways to have $U↓{i-1} \\ U↓i < \cdots
< U↓{i+p-1}\\ U↓{i+p} < \cdots < U↓{i+p+q-1}$; subtract
$p+q+1\choose p+1$ of these in which $U↓{i-1} < U↓i$, and
subtract $p+q+1\choose 1$ for those in which $U↓{i+p-1} <
U↓{i+p}$; then add in 1 for the case that both $U↓{i-1} < U↓i$
and $U↓{i+p-1} < U↓{i+p}$, since this case has been subtracted
out twice. (This is a special case of the ``inclusion-exclusion''
principle, which is explained further in Section 1.3.3.)
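The inclusion-exclusion count is easy to confirm by brute force over all relative orderings; here $V↓0$, $\ldots$, $V↓{p+q}$ stands for $U↓{i-1}$, $\ldots$, $U↓{i+p+q-1}$:

```python
from itertools import permutations
from math import comb

def runs_count_formula(p, q):
    return (p + q + 1) * comb(p + q, p) - comb(p + q + 1, p + 1) - (p + q + 1) + 1

def runs_count_brute(p, q):
    total = 0
    for V in permutations(range(p + q + 1)):
        asc1 = all(V[i] < V[i + 1] for i in range(1, p))          # U_i < ... < U_{i+p-1}
        asc2 = all(V[i] < V[i + 1] for i in range(p + 1, p + q))  # U_{i+p} < ... < U_{i+p+q-1}
        if asc1 and asc2 and V[0] > V[1] and V[p] > V[p + 1]:
            total += 1
    return total
```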
\ansno 14. A run of length $r$ occurs with probability
$1/r! - 1/(r + 1)!$, assuming distinct $U$'s.
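(A brute-force check: among all permutations of $n$ distinct values, the fraction whose initial ascending run has length exactly $r$ is $1/r!-1/(r+1)!$.)

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

def initial_run_prob(n, r):
    # fraction of permutations whose initial ascending run has length exactly r
    cnt = sum(1 for p in permutations(range(n))
              if all(p[i] < p[i + 1] for i in range(r - 1)) and p[r - 1] > p[r])
    return Fraction(cnt, factorial(n))
```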
\ansno 15. This is always true of $F(X)$ when $F$ is
continuous and $S$ has distribution $F$; see Section 3.3.1C.
\ansno 16. (a) $Z↓{jt} = \max(Z↓{j(t-1)}, Z↓{(j+1)(t-1)})$.
If the $Z↓{j(t-1)}$ are stored in memory, it is therefore a
simple matter to transform this array into the set of $Z↓{jt}$
with no auxiliary storage required.\xskip (b) With his ``improvement,''
each of the $V$'s should indeed have the stated distribution,
but the observations are no longer independent. In fact, when
$U↓j$ is a relatively large value, all of $Z↓{jt}$, $Z↓{(j-1)t}$,
$\ldotss$, $Z↓{(j-t+1)t}$ will be equal to $U↓j$; so we almost
have the effect of repeating the same data $t$ times (and that
would multiply $V$ by $t$, cf.\ exercise 3.3.1--10).
\ansno 17. (b) By Lagrange's identity, the difference
is $\sum ↓{0≤k<j<n}(U↑{\prime}↓{k}V↑{\prime}↓{j}
- U↑{\prime}↓{j}V↑{\prime}↓{k})↑2$ and this is certainly positive.\xskip
(c) Therefore if $D↑2 = N↑2$, we must have $U↑{\prime}↓{k}V↑{\prime}↓{j}
- U↑{\prime}↓{j}V↑{\prime}↓{k} = 0$, for all pairs $j, k$. This
means that the matrix
$$\left(\vcenter{\halign{\ctr{$#↑\prime↓1$}⊗ \ctr{$#↑\prime↓2$}$\ldotsm$⊗\ctr
{$#↑\prime↓{n-1}$}\cr
U⊗U⊗U\cr V⊗V⊗V\cr}}\right)$$
has rank $<2$, so its rows are linearly dependent.
(A more elementary proof can be given, using the fact that $U↑\prime↓0V↑\prime↓j
- U↑{\prime}↓{j}V↑{\prime}↓{0} = 0$ for $ 1 ≤ j < n$ implies the
existence of constants $α, β$ such that $αU↑{\prime}↓{j} + βV↑{\prime}↓{j}
= 0$ for all $j$, provided that $U↑{\prime}↓{0}$ and $V↑{\prime}↓{0}$
are not both zero; the latter case can be avoided by a suitable
renumbering.)
\ansno 18. (a) The numerator is $-(U↓0 - U↓1)↑2$, the
denominator is $(U↓0 - U↓1)↑2$.\xskip (b) The numerator is $-(U↑{2}↓{0}
+ U↑{2}↓{1} + U↑{2}↓{2} - U↓0U↓1 - U↓1U↓2 - U↓2U↓0)$; the denominator
is $2(U↑{2}↓{0} + \cdots - U↓2U↓0)$.\xskip (c) The denominator always
equals $\sum ↓{0≤j<k<n}(U↓j - U↓k)↑2$, by exercise
1.2.3--30 or 1.2.3--31.
\ansno 21. The successive values of $c↓{r-1}=s-1$ in step P2 are
2, 3, 7, 6, 4, 2, 2, 1, 0; hence $f=886862$.
\ansno 22. $1024=6!+2\cdot5!+2\cdot4!+2\cdot3!+2\cdot2!+0\cdot1!$, so we
want the successive values of $s-1$ in step P2 to be 0, 0, 0, 1, 2, 2, 2, 2, 0;
working backwards, the permutation is $(9,6,5,2,3,4,0,1,7,8)$.
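The factorial-base representation used in these two answers is computed greedily:

```python
from math import factorial

def factorial_digits(x, n):
    # digits (c_n, ..., c_1) with x = sum of c_k * k!,  0 <= c_k <= k
    digits = []
    for k in range(n, 0, -1):
        f = factorial(k)
        digits.append(x // f)
        x %= f
    return digits
```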
\ansbegin{3.3.3}
\ansno 1. $y((x/y))
+ {1\over 2}y - {1\over 2}y\delta (x/y)$.
\ansno 2. See exercises 1.2.4--38 and 1.2.4--39(a), (b), (g).
\ansno 3. $f(x) = \sum ↓{n≥1}(-\sin 2πnx)/n$, which
converges for all $x$. (The representation in Eq.\ (24) may be
considered a ``finite'' Fourier series, for the case when $x$
is rational.)
\ansno 4. $d = 2↑{10}\cdot 5$. Note that we have $X↓{n+1}
< X↓n$ with probability ${1\over 2} + \epsilon $, where
$$|\epsilon | < d/(2\cdot10↑10) = 1/(2\cdot5↑9);$$
hence {\sl every} potency-10 generator is respectable
from the standpoint of Theorem P.
\ansno 5. An intermediate result:
$$\sum ↓{0≤x<m} {x\over m} {s(x)\over m} = {1\over 12} \sigma
(a, m, c) + {m\over 4} - {c\over 2m} - {x↑\prime \over 2m}.$$
%folio 678 galley 8 (not on paper tape) (C) Addison-Wesley 1978 *
\ansno 6. (a) Use induction and the formula
$$\left(\left(hj+c\over k\right)\right)-\left(\left(hj+c-1\over k\right)\right)
={1\over k}-{1\over 2}\delta\left(hj+c\over k\right)-{1\over2}\delta\left(
hj+c-1\over k\right).$$
(b) Use the fact that $\dispstyle-\left(\left(h↑\prime j\over k\right)\right)
=-\left(\left({j\over hk}-{k↑\prime j\over h}\right)\right)=
\left(\left(k↑\prime j\over h\right)\right)-{j\over hk} + {1\over2}\delta
\left(k↑\prime j\over h\right)$.
\ansno 7. Take $m=h$, $n=k$, $k=2$ in the second formula of exercise 1.2.4--45:
$$\twoline{\sum↓{0<j<k}\biggglp\hskip-2pt
{hj\over k}-\left(\left(hj\over k\right)\right)+
{1\over 2}\hskip-2pt\bigggrp
\biggglp\hskip-2pt {hj\over k}-\left(\left(hj\over k\right)\right)-
{1\over 2}\hskip-2pt\bigggrp+2\<\sum↓{0<j<h}
\biggglp\hskip-2pt {kj\over h}-\left(\left(kj\over h\right)
\right)+{1\over 2}\bigggrp j}{-4pt}{=kh(h-1).\hskip-9pt}$$
The sums on the left simplify, and by standard manipulations we get
$$h↑2k-hk-{h\over2}+{h↑2\over6k}+{k\over12}+{1\over4}-{h\over6}\sigma(h,k,0)
-{h\over6}\sigma(k,h,0)+{1\over12}\sigma(1,k,0)=h↑2k-hk.$$
Since $\sigma(1,k,0)=(k-1)(k-2)/k$, this reduces to the reciprocity law.
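Definition (16) and the reciprocity law can be checked directly in exact arithmetic (a brute-force sketch, feasible only for small $k$):

```python
from fractions import Fraction
from math import gcd

def par(x):
    # ((x)): the sawtooth function, zero at integers
    x = Fraction(x)
    if x.denominator == 1:
        return Fraction(0)
    return x - (x.numerator // x.denominator) - Fraction(1, 2)

def sigma(h, k, c):
    # sigma(h,k,c) = 12 * sum over 0 <= j < k of ((j/k)) (((hj+c)/k))
    return 12 * sum(par(Fraction(j, k)) * par(Fraction(h * j + c, k))
                    for j in range(k))
```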
\ansno 8. See {\sl Duke Math.\ J. \bf21} (1954), 391--397.
\ansno 9. Begin with the handy identity $\sum↓{0≤k<r}\lfloor kp/r\rfloor\lfloor kq/r
\rfloor+\sum↓{0≤k<p}\lfloor kq/p\rfloor\lfloor kr/p\rfloor+\sum↓{0≤k<q}
\lfloor kr/q\rfloor\lfloor kp/q\rfloor = (p-1)(q-1)(r-1)$ for which a simple
geometric proof is possible. [U. Dieter, {\sl Abh.\ Math.\ Sem.\ Univ.\
Hamburg \bf21} (1957), 109--125.]
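(A brute-force check of the handy identity, for a few pairwise relatively prime triples:)

```python
def floor_sum(a, b, m):
    # sum over 0 <= k < m of floor(ka/m) * floor(kb/m)
    return sum((k * a // m) * (k * b // m) for k in range(m))

def identity_holds(p, q, r):
    lhs = floor_sum(p, q, r) + floor_sum(q, r, p) + floor_sum(r, p, q)
    return lhs == (p - 1) * (q - 1) * (r - 1)
```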
\ansno 10. Obviously $\sigma(k-h,k,c)=-\sigma(h,k,-c)$, cf.\ (8). Replace $j$
by $k-j$ in definition (16), to deduce that $\sigma(h,k,c)=\sigma(h,k,-c)$.
\vskip 12pt plus 3pt minus 9pt
\ansno 11. (a)$\dispstyle\sum↓{0≤j<dk}\left(\left(j\over dk\right)\right)\left(
\left(hj+c\over k\right)\right)=\sum↓{\scriptstyle 0≤i<d\atop\scriptstyle 0≤j<k}
\left(\left(ik+j\over dk\right)\right)\left(\left(hj+c\over k\right)\right)$;
use (10) to sum on $i$.\xskip (b) $\dispstyle\left(\left(hj+c+\theta\over k\right)
\right)=\left(\left(hj+c\over k\right)\right)+{\theta\over k}-{1\over2}\delta
\left(hj+c\over k\right)$; now sum.
\vskip 12pt plus 3pt minus 9pt
\ansno 12. Since $\left(\left(hj+c\over k\right)\right)$ runs through the same
values as $\left(\left(j\over k\right)\right)$ in some order, Cauchy's
inequality implies that $\sigma(h,k,c)↑2≤\sigma(h,k,0)↑2$; and $\sigma(1,k,0)$
may be summed directly, cf.\ exercise 7.
\ansno 13. $\dispstyle\sigma(h,k,c)+{3(k-1)\over k}={12\over k}\sum↓{0<j<k}
{\omega↑{-cj}\over(\omega↑{-hj}-1)(\omega↑j-1)}+{6\over k}(c\mod k)
-6\left(\left(h↑\prime c\over k\right)
\right)$,\par\noindent if $hh↑\prime≡1\modulo k$.
\ansno 14. $(2↑{38}-3\cdot2↑{20}+5)/(2↑{70}-1)\approx2↑{-32}$. An extremely
satisfactory global value, in spite of the local nonrandomness!
\ansno 15. Replace $c↑2$ where it appears in (19) by $\lfloor c\rfloor\lceil
c\rceil$.
\ansno 16. The hinted identity is equivalent to $m↓1=p↓rm↓{r+1}+p↓{r-1}m↓{r+2}$
for $1≤r≤t$; this follows by induction, cf.\ also exercise 4.5.3--32. Now
replace $c↓j$ by $\sum↓{j≤r≤t}b↓rm↓{r+1}$ and compare coefficients of $b↓ib↓j$
on both sides of the identity to be proved.
({\sl Note:} For all exponents $e≥1$ we have
$$\sum↓{1≤j≤t}(-1)↑{j+1}{c↑e↓j\over m↓jm↓{j+1}}={1\over m↓1}\sum↓{1≤j≤t}
(-1)↑{j+1}b↓j{(c↑e↓j-c↑e↓{j+1})\over c↓j-c↓{j+1}}p↓{j-1}$$
by a similar argument.)
\ansno 17. During this algorithm we will have $k=m↓j$, $h=m↓{j+1}$, $c=c↓j$,
$p=p↓j-1$, $p↑\prime=p↓{j-2}$, $s=(-1)↑{j+1}$ for $j=1$, 2, $\ldotss$, $t+1$.
\algstep D1. [Initialize.] Set $A←0$, $B←h$, $p←1$, $p↑\prime
←0$, $s←1$.
\algstep D2. [Divide.] Set $a←\lfloor k/h\rfloor$, $b←\lfloor c/h\rfloor$,
$r←c\mod h$. (Now $a=a↓j$, $b=b↓j$, $r=c↓{j+1}$.)
\algstep D3. [Accumulate.] Set $A←A+(a-6b)s$, $B←B+6bp(c+r)s$. If $r≠0$ or
$c=0$, set $A←A-3s$. If $h=1$, set $B←B+ps$. (This subtracts $3e(m↓{j+1},c↓j)$
and also takes care of the $\sum(-1)↑{j+1}/m↓jm↓{j+1}$ terms.)
\algstep D4. [Prepare for next iteration.] Set $c←r$, $s←-s$; set $r←k-ah$,
$k←h$, $h←r$; set $r←ap+p↑\prime$, $p↑\prime←p$, $p←r$. If $h>0$, return to
D2.\quad\blackslug
\penalty-200 % This will help the following page, the way things are breaking
\yyskip At the conclusion of this algorithm, $p$ will be equal to the original value
$k↓0$ of $k$, so the desired answer will be $A+B/p$. The final value of $p↑\prime$
will be $h↑\prime$ if $s<0$, otherwise $p↑\prime$ will be $k↓0-h↑\prime$.
It would be possible to
maintain $B$ in the range $0≤B<k↓0$, by making appropriate adjustments to
$A$, thereby requiring only single-precision operations (with double-precision
products and dividends) if $k↓0$ is a single-precision number.
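Algorithm D can be transliterated and checked against definition (16) as follows (a sketch; the accumulation in step D3 is taken as $A←A+(a-6b)s$, the sign that agrees with direct evaluation of the definition on small cases, and $0≤c<k$ with $h$ relatively prime to $k$ is assumed):

```python
from fractions import Fraction
from math import gcd

def sigma_alg_d(h, k, c):
    # Evaluates sigma(h,k,c) by the continued-fraction recurrence of Algorithm D.
    A, B = Fraction(0), Fraction(h)
    p, pp, s = 1, 0, 1
    while h > 0:
        a, b, r = k // h, c // h, c % h          # step D2
        A += (a - 6 * b) * s                     # step D3 (see note above)
        B += 6 * b * p * (c + r) * s
        if r != 0 or c == 0:
            A -= 3 * s
        if h == 1:
            B += p * s
        c, s = r, -s                             # step D4
        k, h = h, k - a * h
        p, pp = a * p + pp, p
    return A + B / p                             # p ends equal to the original k

def sigma_direct(h, k, c):
    # Definition (16): 12 * sum ((j/k)) (((hj+c)/k)), exact rational arithmetic
    def par(num, den):
        return Fraction(0) if num % den == 0 \
            else Fraction(num, den) - num // den - Fraction(1, 2)
    return 12 * sum(par(j, k) * par(h * j + c, k) for j in range(k))
```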
\ansno 18. A moment's thought shows that the formula
$$\textstyle S(h,k,c,z)=\sum↓{0≤j<k}
\biglp\lfloor j/k\rfloor-\lfloor(j-z)/k\rfloor\bigrp\biglp\biglp(hj+c)/k\bigrp
\bigrp$$ is in fact valid for all $z$, not only when $k≥z$. Writing
$\lfloor j/k\rfloor-\lfloor(j-z)/k\rfloor={z\over k}+\left(\left(j-z\over k\right)
\right)-\left(\left(j\over k\right)\right)+{1\over2}\delta↓{j0}-{1\over2}\delta
\left(j-z\over k\right)$
and carrying out the sums yields $S(h,k,c,z)=zd((c/d))/k+{1\over12}\sigma(h,k,hz+c)
-{1\over12}\sigma(h,k,c)+{1\over2}((c/k))-{1\over2}\biglp\biglp(hz+c)/k\bigrp
\bigrp$, where $d=\gcd(h,k)$.
% This paragraph break introduced to fill up a sparse page
[This formula allows us to express the probability that
$X↓{n+1}<X↓n<α$ in terms of generalized Dedekind sums, given $α$.]
\ansno 19. The desired probability is
$$
\def\\{\lower 1.36pt\nullα} % makes α have an effective tail like β
\halign{\hjust to size{$\textstyle#$\hfill}\cr
\sum↓{0≤x<m}\biglp\lfloor(x-α)/m\rfloor-
\lfloor(x-β)/m\rfloor\bigrp\biglp\lfloor(s(x)-α↑\prime)/m\rfloor-
\lfloor(s(x)-β↑\prime)/m\rfloor\bigrp/m\cr
\noalign{\vskip4pt}
\quad=\sum↓{0≤x<m}\left({β-α\over m}+\left(\left(x-β\over m
\right)\right)-\left(\left(x-\\\over m\right)\right)+{1\over2}\delta\left(x-\\
\over m\right)-{1\over2}\delta\left(x-β\over m\right)\right)\times\null\cr
\noalign{\penalty1000}
\def\biglp{\mathopen{\vcenter{\hjust{\:@\char0}}}}
\def\bigrp{\mathclose{\vcenter{\hjust{\:@\char1}}}}
\quad\qquad
\biglp{β↑\prime-α↑\prime\over m}+\biglp\biglp{ax+c-β↑\prime\over m}\bigrp \bigrp
-\biglp\biglp{ax+c-\\↑\prime\over m}\bigrp \bigrp +{1\over2}\delta\biglp
{ax+c-\\↑\prime\over m}\bigrp -{1\over 2}\delta\biglp{ax+c-β↑\prime\over m}\bigrp
\bigrp /m\cr
\noalign{\vskip4pt}
\quad={β-α\over m}{β↑\prime-α↑\prime\over m}+{1\over12m}
\biglp\sigma(a,m,c+aα-α↑\prime)-\sigma(a,m,c+aα-β↑\prime)+\null\cr
\noalign{\penalty1000}
\hskip 120pt\sigma(a,m,c+aβ-β↑\prime)
-\sigma(a,m,c+aβ-α↑\prime)\bigrp+ε,\cr}$$
where $|ε|≤2.5/m$.
[This approach is
due to U. Dieter. The discrepancy between the true probability and the
ideal value ${β-α\over m}{β↑\prime-α↑\prime\over m}$ is
bounded by $\sum↓{1≤j≤t}a↓j/4m$, according to Theorem K; conversely, by
choosing $α$, $β$, $α↑\prime$, $β↑\prime$ appropriately we will obtain a
discrepancy of at least half this bound when there are large partial quotients,
by the fact that Theorem K is ``best possible.'' Note that when $a
\approx \sqrt m$ the discrepancy cannot exceed $O\biglp1/\sqrt m\,\bigrp$, so
even the locally nonrandom generator of exercise 14 will look good on the
serial test over the full period; it appears that we should insist on an
{\sl extremely} small discrepancy.]
\ansno 20. $\sum↓{0≤x<m}\left.\left\lceil\biglp x-s(x)\bigrp/m\right\rceil
\left\lceil\biglp s(x)-s(s(x))\bigrp/m\right\rceil\right/m
\hskip0ptplus3pt=\hskip0ptplus3pt
\sum↓{0≤x<m}\biglp\biglp x-s(x)\bigrp/m+\biglp\biglp(bx+c)/m\bigrp\bigrp+{1\over2}
\bigrp\biglp\biglp s(x)-s(s(x))\bigrp/m+\biglp\biglp a(bx+c)/m\bigrp\bigrp
+{1\over2}\bigrp\hjust{\:a/}m$; and $x/m=((x/m))+{1\over2}-{1\over2}\delta(x/m)$,
$s(x)/m=\biglp\biglp(ax+c)/m\bigrp\bigrp+{1\over2}-{1\over2}\delta\biglp(ax+c)
/m\bigrp$, $s(s(x))/m=\biglp\biglp(a↑2x+ac+c)/m\bigrp\bigrp+{1\over2}-{1\over2}
\delta\biglp(a↑2x+ac+c)/m\bigrp$. Let $s(x↑\prime)=s(x↑{\prime\prime})=0$ and
$d=\gcd(b,m)$. The sum now reduces to
$$\twoline{{1\over4}+{1\over12m}(S↓1-S↓2+S↓3-S↓4+S↓5-S↓6+S↓7-S↓8+S↓9)+
{d\over2m}\biggglp\left(\left(c\over d\right)\right)+\left(\left(ac\over d\right)
\right)\bigggrp}{2pt}{\null+{1\over2m}\biggglp\left(\left(x↑\prime-x↑{\prime\prime}
\over m\right)\right)-\left(\left(x↑\prime\over m\right)\right)+\left(\left(
x↑{\prime\prime}\over m\right)\right)+\left(\left(ac+c\over m\right)\right)-
\left(\left(c\over m\right)\right)-{1\over2}\bigggrp,}$$
where $S↓1=\sigma(a,m,c)$, $S↓2=\sigma(a↑2,m,ac+c)$, $S↓3=\sigma(ab,m,ac)$,
$S↓4=\sigma(1,m,0)=(m-\penalty1000 1)\*(m-2)/m$, $S↓5=\sigma(a,m,c)$, $S↓6=\sigma(b,m,c)$,
$S↓7=-\sigma(a↑\prime-1,m,a↑\prime c)$, $S↓8=-\sigma(a↑\prime(a↑\prime-1),\penalty0
m,(a↑\prime)↑2c)$, if $a↑\prime a≡1\modulo m$; and
$$\eqalign{S↓9⊗=12\sum↓{0≤x<m}\left(\left(bx+c\over m\right)\right)
\left(\left(a(bx+c)\over m\right)\right)\cr⊗=12d\sum↓{0≤x<m/d}\left(\left(x+c↓0/d
\over m/d\right)\right)\left(\left(a(x+c↓0/d)\over m/d\right)\right)\cr
⊗=12d\sum↓{0≤x<m/d}\biggglp\left(\left(x\over m/d\right)\right)+
{c↓0\over m}-{1\over2}\delta↓{x0}\bigggrp\left(\left(a(x+c↓0/d)\over m/d
\right)\right)\cr
⊗=d\biggglp\sigma(ad,m,ac↓0)+12{c↓0\over m}\left(\left(ac↓0\over d\right)\right)
-6\left(\left(ac↓0\over m\right)\right)\bigggrp\cr}$$
where $c↓0=c\mod d$. The grand total will be near $1\over6$ when $d$ is small
and when the fractions $a/m$, $(a↑2\mod m)/m$, $(ab\mod m)/m$, $b/m$, $(a↑\prime-1)
/m$, $(a↑\prime(a↑\prime-1)\mod m)/m$, $((ad)\mod m)/m$ all have small partial
quotients. (Note that $a↑\prime-1≡-b+b↑2-\cdotss$, cf.\ exercise 3.2.1.3--7.)
\ansno 21. $C=(s-({1\over2})↑2)/({1\over3}-({1\over2})↑2)$, where
$$\eqalign{s↓n⊗=\int↓{x↓n}↑{x↓{n+1}}x\{ax+\theta\}\,dx={1\over a↑2}\left(
{1\over3}-{\theta\over2}+{n\over 2}\right),\qquad
\hjust{if }x↓n={n-\theta\over a};\cr
s⊗=\int↓0↑1 x\{ax+\theta\}\,dx=s↓0+s↓1+\cdots+s↓{a-1}+\int↓{-\theta/a}↑0(ax+\theta)
\,dx\cr⊗={1\over3a}-{\theta\over2a}+{a-1\over4a}+{\theta↑2\over2a}.\cr}$$
Therefore $C=(1-6\theta+6\theta↑2)/a$.
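(A numerical sanity check of $s$ and $C$, with illustrative values $a=5$, $\theta=0.3$, by the midpoint rule:)

```python
a, theta = 5, 0.3
N = 200_000
# midpoint rule for s = integral from 0 to 1 of x {ax + theta} dx
s = sum(((i + 0.5) / N) * ((a * (i + 0.5) / N + theta) % 1.0)
        for i in range(N)) / N
closed = 1/(3*a) - theta/(2*a) + (a - 1)/(4*a) + theta**2/(2*a)
C = (s - 0.25) / (1/3 - 0.25)          # (s - (1/2)^2) / ((1/3) - (1/2)^2)
C_closed = (1 - 6*theta + 6*theta**2) / a
```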
\ansno 22. Let $[u,v)$ denote the set $\leftset x \relv u≤x<v\rightset$. We have
$s(x)<x$ in the disjoint intervals
$$\left[{1-\theta\over a},{1-\theta\over a-1}\right),\;
\left[{2-\theta\over a},{2-\theta\over a-1}\right),\;\ldotss,\;
\left[{a-\theta\over a},1\right),$$
which have total length
$$1+\sum↓{0<j≤a-1}\left(j-\theta\over a-1\right)-\sum↓{0<j≤a}\left(j-\theta\over
a\right)=1+{a\over2}-\theta-{a+1\over2}+\theta={1\over2}.$$
\ansno 23. We have $s(s(x))<s(x)<x$ when $x$ is in $\left[{k-\theta\over a},
{k-\theta\over a-1}\right)$ and $ax+\theta-k$ is in $\left[{j-\theta\over a},
{j-\theta\over a-1}\right)$, for $0<j≤k<a$; or when $x$ is in $\left[{a-\theta
\over a},1\right)$ and $ax+\theta-a$ is either in $\left[{j-\theta\over a},{j-\theta
\over a-1}\right)$ for $0<j≤\lfloor a\theta\rfloor$ or in
$\vcenter{\hjust{\:@\char2}}{\lfloor a\theta\rfloor+1-\theta\over a},\theta
\vcenter{\hjust{\:@\char1}}$.
The desired probability is
$$\twoline{\sum↓{0<j≤k<a}{j-\theta\over a↑2(a-1)}+\sum↓{0<j≤\lfloor a\theta\rfloor}
{j-\theta\over a↑2(a-1)}+{1\over a↑2}\max(0,\{a\theta\}+\theta-1)}{0pt}{
={1\over6}+{1\over6a}-{\theta\over2a}+{1\over a↑2}\left({\lfloor a\theta\rfloor
(\lfloor a\theta\rfloor+1-2\theta)\over2(a-1)}+\max(0,\{a\theta\}+\theta-1)\right),
}$$ which is ${1\over6}+(1-3\theta+3\theta↑2)/6a+O(1/a↑2)$ for large $a$. Note
that $1-3\theta+3\theta↑2≥{1\over4}$, so $\theta$ can't be chosen to make this
probability come out right.
%folio 687 galley 1 (C) Addison-Wesley 1978 *
\ansno 24. Proceed as in the previous exercise; the sum of the interval
lengths is
$$\sum↓{0<j↓1≤\cdots≤j↓{t-1}<a}\,{j↓1\over a↑{t-1}(a-1)}=
{1\over a↑{t-1}(a-1)}{a+t-2\choose t}.$$
To compute the average length, let $p↓k$ be the probability of a run of length
$≥k$; the average is
$$\sum↓{k≥1}p↓k=\sum↓{k≥1}{a+k-2\choose k}{1\over a↑{k-1}(a-1)}=\left(a\over
a-1\right)↑a-{a\over a-1}.$$
The value for a truly random sequence would be $e - 1$;
and our value is $e - 1 + (e/2 -\penalty1000 1)/a + O(1/a↑2)$. {\sl Note:} The
same result holds for an ascending run, since $U↓n > U↓{n+1}$
if and only if $1 - U↓n < 1 - U↓{n+1}$. This would lead us to
suspect that runs in linear congruential sequences might
be slightly longer than normal, so the run test should be
applied to such generators.
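(An exact check of the average run length: the partial sums of $\sum↓{k≥1}p↓k$ converge geometrically to $(a/(a-1))↑a-a/(a-1)$.)

```python
from fractions import Fraction
from math import comb

def mean_run_length(a, terms=120):
    # partial sum of  sum over k >= 1 of C(a+k-2, k) / (a^(k-1) (a-1))
    return sum(Fraction(comb(a + k - 2, k), a**(k - 1) * (a - 1))
               for k in range(1, terms))

def closed_form(a):
    return Fraction(a, a - 1)**a - Fraction(a, a - 1)
```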
\ansno 25. $x$ must be in the interval
$[(k + α↑\prime - \theta)/a, (k + β↑\prime - \theta)/a)$ for some $k$,
and also in the interval $[α, β)$. Let $k↓0 = \lceil aα + \theta - β↑\prime \rceil$,
$k↓1 = \lceil aβ + \theta - β↑\prime \rceil$.
With due regard to boundary
conditions, we get the probability
$$(k↓1 - k↓0)(β↑\prime - α↑\prime )/a + \max
\biglp 0,β - (k↓1 + α↑\prime - \theta)/a\bigrp - \max\biglp
0, α - (k↓0 + α↑\prime - \theta)/a\bigrp.$$
This is $(β - α)(β↑\prime - α↑\prime ) +
\epsilon $, where $|\epsilon | < 2(β↑\prime - α↑\prime )/a$.
\ansno 26. See Fig$.$ A--1; the orderings $U↓1
< U↓3 < U↓2$ and $U↓2 < U↓3 < U↓1$ are impossible; the other
four each have probability ${1\over 4}$.
\topinsert{\vskip 28mm
\hjust to 48mm{\caption Fig. A--1. Permutation regions for
Fibonacci generator.}
\vskip 13mm
\hjust to 48mm{\caption Fig. A--2. Run-length regions for Fibonacci
generator.}}
\ansno 27. $U↓n = \{F↓{n-1}U↓0 + F↓nU↓1\}$. We need $F↓{k-1}U↓0
+ F↓kU↓1 < 1$ and $F↓kU↓0 + F↓{k+1}U↓1 > 1$. The half-unit-square
in which $U↓0 > U↓1$ is broken up as shown in Fig$.$ A--2, with
various values of $k$ indicated. The probability for a run of
length $k$ is ${1\over 2}$ if $k = 1$; $1/F↓{k-1}F↓{k+1} - 1/F↓kF↓{k+2}$,
if $k > 1$. The corresponding probabilities for a random sequence
are $2k/(k+1)!-2(k+1)/(k+2)!$; the following table compares the
first few values.
\ctrline{$
\vcenter{\halign{#⊗\quad\ctr{$#$}⊗\quad\ctr{$#$}⊗\quad\ctr{$#$}⊗\quad
\ctr{$#$}⊗\quad\ctr{$#$}\cr
\hfill $k$: \hfill⊗1⊗2⊗3⊗4⊗5\cr
Probability in Fibonacci case:\hfill⊗1\over 2⊗1\over 6⊗7\over 30⊗ 7\over 120⊗ 41\over 1560\cr
Probability in random case:\hfill⊗1\over 3⊗5\over 12⊗11\over
60⊗19\over 360⊗29\over 2520\cr}}$}
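(The probabilities $1/F↓{k-1}F↓{k+1}-1/F↓kF↓{k+2}$ telescope, so together with the $k=1$ term ${1\over2}$ the partial sums fall short of 1 by exactly $1/F↓KF↓{K+2}$. A sketch:)

```python
from fractions import Fraction

F = [0, 1]                      # Fibonacci numbers F_0, F_1, ...
while len(F) < 40:
    F.append(F[-1] + F[-2])

def run_prob(k):
    # probability of a run of length k in the Fibonacci generator
    if k == 1:
        return Fraction(1, 2)
    return Fraction(1, F[k-1] * F[k+1]) - Fraction(1, F[k] * F[k+2])
```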
\ansno 28. Fig$.$ A--3 shows the various regions in the
general case. The ``213'' region means $U↓2 < U↓1 < U↓3$, if
$U↓1$ and $U↓2$ are chosen at random; the ``321'' region means
$U↓3 < U↓2 < U↓1$, etc. The probabilities for 123 and 321 are
${1\over 4} - α/2 + α↑2/2$; the probabilities for all other
cases are ${1\over 8} + α/4 - α↑2/4$. To have all equal to
${1\over 6}$, we must have $1 - 6α + 6α↑2 = 0$.
[This exercise establishes a theorem due to J. N. Franklin,
{\sl Math$.$ Comp$.$ \bf 17} (1963), 28--59, Theorem 13; other results
of Franklin's paper are related to exercises 22 and 23.]
\topinsert{\vskip 104mm
\ctrline{\caption Fig. A--3. Permutation regions for a generator
with potency 2; $α = (a - 1)c/m$.}}
\ansbegin{3.3.4}
\ansno 1. $\nu ↓1$ is always $m$ and
$\mu ↓1 = 2$, for generators of maximum period.
\ansno 2. Let $V$ be the matrix whose rows are
$V↓1, \ldotss , V↓t$. To minimize $Y \cdot Y$, subject to the
condition that $Y ≠ (0, \ldotss , 0)$ and $VY$ is an integer
column vector $X$, is equivalent to minimizing $(V↑{-1}X) \cdot
(V↑{-1}X)$, subject to the condition that $X$ is a nonzero integer
column vector. The columns of $V↑{-1}$ are $U↓1$, $\ldotss$, $U↓t$.
\ansno 3. $a↑2 ≡ 2a - 1$ and $a↑3 ≡ 3a - 2 \modulo
m$. By considering all short solutions of (15), we find that
$\nu↓3↑2 = 6$ and $\nu↓4↑2 = 4$, for the respective
vectors $(1, -2, 1)$ and $(1, -1, -1, 1)$, except in the following
cases: $m = 2↑eq$, $q$ odd, $e ≥ 3$, $a ≡ 2↑{e-1}+1 \modulo {2↑e}$,
$a ≡ 1 \modulo q$, $\nu↓3↑2 = \nu↓4↑2 = 2$; $m =
3↑eq$, $\gcd(3, q) = 1$, $e ≥ 2$, $a ≡ 1 \pm 3↑{e-1} \modulo {3↑e}$,
$a ≡ 1 \modulo q$, $\nu↓4↑2 = 2$; $m = 9$, $a = 4$ or 7,
$\nu↓2↑2 = \nu↓3↑2 = 5$.
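(These values are small enough to confirm by exhaustive search; $\nu↓t↑2$ is the minimum of $x \cdot x$ over nonzero integer solutions of $x↓1+ax↓2+\cdots+a↑{t-1}x↓t ≡ 0 \modulo m$, and a search bound of 3 per component suffices here, since a minimum this small forces every component below that bound.)

```python
from itertools import product

def nu_squared(m, a, t, R=3):
    # brute-force minimum of x.x over nonzero solutions of (15)
    best = None
    for x in product(range(-R, R + 1), repeat=t):
        if any(x) and sum(x[i] * a**i for i in range(t)) % m == 0:
            norm = sum(e * e for e in x)
            if best is None or norm < best:
                best = norm
    return best
```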
\ansno 4. (a) The unique choice for $(x↓1, x↓2)$
is ${1\over m}(y↓1u↓{22} - y↓2u↓{21}, -y↓1u↓{12} + y↓2u↓{11})$,
and this is $≡ {1\over m}(y↓1u↓{22} + y↓2au↓{22}, -y↓1u↓{12}
- y↓2au↓{12}) ≡ (0, 0) \modulo 1$; i.e., $x↓1$ and $x↓2$ are
integers.\xskip (b) When $(x↓1, x↓2) ≠ (0, 0)$, we have $(x↓1u↓{11}
+ x↓2u↓{21})↑2 + (x↓1u↓{12} + x↓2u↓{22})↑2 = x↓1↑2(u
↓{11}↑2 + u↓{12}↑2) + x↓2↑2(u↓{21}↑2 + u↓{22}↑2)
+ 2x↓1x↓2(u↓{11}u↓{21} + u↓{12}u↓{22}) ≥ (x↓1↑2
+ x↓2↑2 - |x↓1x↓2|)(u↓{11}↑2 + u↓{12}↑2)
≥ u↓{11}↑2+u↓{12}↑2$. [Note that this is a stronger
result than Lemma A, which tells us only that $x↓1↑2
≤ (u↓{11}↑2 + u↓{12}↑2)(u↓{21}↑2 + u↓{22}↑2)
/m↑2$, $x↓2↑2 ≤ (u↓{11}↑2 + u↓{12}↑2)↑2/m↑2$,
and the latter can be $≥1$. The idea is essentially Gauss's notion
of a reduced binary quadratic form, {\sl Disq.\ Arith.\ } (Leipzig,
1801), $\char'570 171$.]
\ansno 5. Conditions (30) remain invariant; hence
$h$ cannot be zero in step S2, when $a$ is relatively prime
to $m$. Since $h$ always decreases in that step, S2 eventually
terminates with $u↑2 + v↑2 ≥ s$. Note that $pp↑\prime ≤ 0$ throughout
the calculation.
The hinted inequality surely holds the first time step S2 is performed.
The integer $q↑\prime$ which minimizes $(h↑\prime - q↑\prime h)↑2
+ (p↑\prime - q↑\prime p)↑2$ is $q↑\prime = \hjust{round}\biglp(h↑\prime
h + p↑\prime p)/(h↑2 + p↑2)\bigrp$, by (24).
If $(h↑\prime - q↑\prime h)↑2 + (p↑\prime - q↑\prime p)↑2 <
h↑2 + p↑2$ we must have $q↑\prime ≠ 0$, $q↑\prime ≠ -1$, hence
$(p↑\prime - q↑\prime p)↑2 ≥ p↑2$, hence $(h↑\prime - q↑\prime
h)↑2 < h↑2$, i.e., $|h↑\prime - q↑\prime h| < h$, i.e., $q↑\prime$ is
$q$ or $q + 1$. We have $hu + pv ≥ h(h↑\prime - q↑\prime h)
+ p(p↑\prime - q↑\prime p) ≥ -{1\over 2}(h↑2 + p↑2)$, so if
$u↑2 + v↑2 < s$ the next iteration of step S2 will preserve
the assumption in the hint. If $u↑2 + v↑2 ≥ s > (u - h)↑2 +
(v - p)↑2$, we have $2|h(u - h) + p(v - p)| = 2\biglp h(h -
u) + p(p - v)\bigrp = (u - h)↑2 + (v - p)↑2 + h↑2
+ p↑2 - (u↑2 + v↑2) ≤ (u - h)↑2 + (v - p)↑2 ≤ h↑2 + p↑2$, hence
$(u - h)↑2 + (v - p)↑2$ is minimal by exercise 4. Finally if
both $u↑2 + v↑2$ and $(u - h)↑2 + (v - p)↑2$ are $≥s$, let $u↑\prime
= h↑\prime - q↑\prime h$, $v↑\prime = p↑\prime - q↑\prime p$;
then $2|hu↑\prime + pv↑\prime | ≤ h↑2 + p↑2 ≤ {u↑\prime}↑2 +
{v↑\prime}↑2$, and $h↑2 + p↑2$ is minimal by exercise 4.
\ansno 6. If $u↑2 + v↑2 ≥ s > (u - h)↑2 + (v -
p)↑2$ in the previous answer, we have $(v - p)↑2 > v↑2$, hence
$(u - h)↑2 < u↑2$; and if $q = a↓j$, so that $h↑\prime = a↓jh
+ u$, we must have $a↓{j+1} = 1$. It follows that $\nu
↓2↑2 = \min↓{0≤j<t}(m↓j↑2 + p↑{2}↓{j-1})$, in
the notation of exercise 3.3.3--16.
Now $m↓0 = m↓jp↓j + m↓{j+1}p↓{j-1} = a↓jm↓jp↓{j-1} + m↓jp↓{j-2}
+ m↓{j+1}p↓{j-1} < (a↓j + 1 + 1/a↓j)m↓jp↓{j-1} ≤ (A + 1 + 1/A)m↓jp↓{j-1}$,
and $m↓j↑2 + p↑{2}↓{j-1} ≥ 2m↓jp↓{j-1}$, hence the result.
\ansno 7. We shall prove, using condition (19),
that $U↓j \cdot U↓k = 0$ for all $k ≠ j$ iff $V↓j \cdot V↓k
= 0$ for all $k ≠ j$. Assume that $U↓j \cdot U↓k = 0$ for all
$k ≠ j$, and let $U↓j = α↓1V↓1 + \cdots + α↓tV↓t$. Then $U↓j
\cdot U↓k = α↓k$ for all $k$, hence $U↓j = α↓jV↓j$, and $V↓j
\cdot V↓k = α↑{-1}↓{j}(U↓j \cdot V↓k) = 0$ for all $k ≠ j$.
A symmetric argument proves the converse.
\ansno 8. Clearly $\nu↓{t+1} ≤ \nu ↓t$ (a fact used
implicitly in Algorithm S, since $s$ is not changed when $t$
increases). For $t = 2$ this is equivalent to $(m\mu ↓2/π)↑{1/2}
≥ ({3\over 4}m\mu ↓3/π)↑{1/3}$, i.e., $\mu ↓3 ≤ {4\over 3} \sqrt{m/π}\,
\mu ↑{3/2}↓{2}$. This reduces to ${4\over 3}10↑{-4}/
\sqrt{π}$ with the given parameters, but for large
$m$ and fixed $\mu ↓2$ the bound (39) is better.
%folio 690 galley 2 (C) Addison-Wesley 1978 *
\ansno 9. Let $f(y↓1, \ldotss , y↓t) = \theta $; then
$\gcd(y↓1, \ldotss , y↓t) = 1$, so there is an integer matrix
$W$ of determinant 1 having $(w↓1, \ldotss , w↓t)$ as its first
row. (Prove the latter fact by induction on the magnitude of
the smallest nonzero entry in the row.) Now if $X = (x↓1, \ldotss
, x↓t)$ is a row vector, we have $XW = X↑\prime$ iff $X = X↑\prime
W↑{-1}$, and $W↑{-1}$ is an integer matrix of determinant 1,
hence the form $g$ defined by $WU$ satisfies $g(x↓1, \ldotss
, x↓t) = f(x↓1↑\prime , \ldotss , x↓t↑\prime )$;
furthermore $g(1, 0, \ldotss , 0) = \theta $.
Without loss of generality, assume that $f = g$.
If now $S$ is any orthogonal matrix, the matrix $US$ defines
the same form as $U$, since $(XUS)(XUS)↑T = (XU)(XU)↑T$. Choosing
$S$ so that its first column is a multiple of $V↓1$ and its
other columns are any suitable vectors, we have
$$US=\left(\vcenter{\halign{\ctr{$#$}⊗\ctr{$#$}\cr
α↓1⊗\quad α↓2\quad \ldots\quad α↓t\cr
0\cr
\vcenter{\baselineskip4pt\lineskip0pt\hjust{.}\hjust{.}\hjust{.}}⊗U↑\prime\cr
\noalign{\vskip 2pt}
0\cr}}\right)$$
for some $α↓1$, $α↓2$, $\ldotss$, $α↓t$ and some $(t - 1)
\times (t - 1)$ matrix $U↑\prime $. Hence $f(x↓1, \ldotss , x↓t)
= (α↓1x↓1 + \cdots + α↓tx↓t)↑2 + h(x↓2, \ldotss , x↓t)$. It follows
that $α↓1 = \sqrt{\theta }$ \raise 8.5pt\hjust{\:@\char'2}\! % 12pt left bracket
in fact, $α ↓j = (U↓1 \cdot U↓j)/ \sqrt{\theta }$
for $1 ≤ j ≤ t$\raise 8.5pt\hjust{\:@\char'3}, %12pt right bracket
and that $h$ is a positive definite quadratic
form defined by $U↑\prime $, where $\det U↑\prime = (\det U)/
\sqrt{\theta }$. By induction on $t$, there are integers
$(x↓2, \ldotss , x↓t)$ with $h(x↓2, \ldotss , x↓t) ≤ ({4\over
3})↑{(t-2)/2} |\det U|↑{2/(t-1)}/\theta ↑{1/(t-1)}$, and for
these integer values we can choose $x↓1$ so that $|x↓1 + (α↓2x↓2
+ \cdots + α↓tx↓t)/α↓1| ≤ {1\over 2}$, i.e., $(α↓1x↓1 + \cdots
+ α↓tx↓t)↑2 ≤ {1\over 4}\theta $. Hence $\theta ≤ f(x↓1, \ldotss
, x↓t) ≤ {1\over 4}\theta + ({4\over 3})↑{(t-2)/2} |\det U|↑{2/(t-1)}/\theta
↑{1/(t-1)}$ and the desired inequality follows immediately.
[For $t = 2$ the result is best possible. For
general $t$, Hermite's theorem implies that
$\mu ↓t ≤ π↑{t/2}(4/3)↑{t(t-1)/4}/(t/2)!\,$.
A fundamental theorem due to Minkowski (``Every $t$-dimensional
convex set symmetric about the origin with volume $≥2↑t$ contains
a nonzero integer point'') gives $\mu ↓t ≤ 2↑t$; this is stronger
than Hermite's theorem for $t ≥ 9$. And even stronger results
are known, cf.\ (41).]
\ansno 10. Since $y↓1$ and $y↓2$ are relatively prime, we can
solve $u↓1y↓2 - u↓2y↓1 = m$; furthermore $(u↓1 + qy↓1)y↓2 -
(u↓2 + qy↓2)y↓1 = m$ for all $q$, so we can ensure that $2|u↓1y↓1
+ u↓2y↓2| ≤ y↓1↑2 + y↓2↑2$ by choosing an appropriate
integer $q$. Now $y↓2(u↓1 + au↓2) ≡ y↓2u↓1 - y↓1u↓2 ≡ 0 \modulo
m$, and $y↓2$ must be relatively prime to $m$, hence $u↓1
+ au↓2 ≡ 0$. Finally let $|u↓1y↓1 + u↓2y↓2| = αm$, $u↓1↑2
+ u↓2↑2 = βm$, $y↓1↑2 + y↓2↑2 = \gamma m$;
we have $0 ≤ α ≤ {1\over 2}\gamma $, and it remains to be shown
that $α ≤ {1\over 2}β$ and $β\gamma ≥ 1$. The identity $(u↓1y↓2
- u↓2y↓1)↑2 + (u↓1y↓1 + u↓2y↓2)↑2 = (u↓1↑2 + u
↓2↑2)(y↓1↑2 + y↓2↑2)$ implies that $1 + α↑2 =
β\gamma $. If $α > {1\over 2}β$, we have $2α\gamma > 1 + α↑2$,
i.e., $\gamma - \sqrt{\gamma ↑2 - 1} < α ≤
{1\over 2}\gamma $. But ${1\over 2}\gamma <
\sqrt{\gamma ↑2 - 1}$ implies that $\gamma ↑2 > {4\over 3}$,
a contradiction.
\ansno 11. Since $a$ is odd, $y↓1 + y↓2$ must be even. To avoid
solutions with $y↓1$ and $y↓2$ both even, let $y↓1 = x↓1 + x↓2$,
$y↓2 = x↓1 - x↓2$, and solve $x↓1↑2 + x↓2↑2 = m/
\sqrt{3} - \epsilon $, with $(x↓1, x↓2)$ relatively
prime and $x↓1$ even; the corresponding multiplier $a$ will
be the solution to $(x↓2 - x↓1)a ≡ x↓2 + x↓1 \modulo {2↑e}$.
It is not difficult to prove that $a ≡ 1 \modulo {2↑{k+1}}$
iff $x↓1 ≡ 0 \modulo {2↑k}$, so we get the best potency when
$x↓1 \mod 4 = 2$. The problem reduces to finding relatively prime
solutions to $x↓1↑2 + x↓2↑2 = N$ where $N$ is
a large integer of the form $4k + 1$. By factoring $N$ over
the Gaussian integers, we can see that solutions exist if and
only if each prime factor of $N$ (over the usual integers) has
the form $4k + 1$.
According to a famous theorem of Fermat, every
prime $p$ of the form $4k + 1$ can be written $p = u↑2 + v↑2
= (u + iv)(u - iv)$, $v$ even, in a unique way except for the
signs of $u$ and $v$. The numbers $u$ and $v$ can be calculated
efficiently by solving $x↑2 ≡ -1 \modulo p$, then calculating
$u + iv = \gcd(x + i, p)$ by Euclid's algorithm over the Gaussian
integers. [We can take $x = n↑{(p-1)/4} \mod p$ for almost
half of all integers $n$. This application of a Euclidean algorithm
is essentially the same as finding the least nonzero $u↑2 +
v↑2$ such that $u \pm xv ≡ 0 \modulo p$.] If the prime factorization
of $N$ is $p↑{e↓1}↓{1} \ldotss p↑{e↓r}↓{r} = (u↓1 + iv↓1)↑{e↓1}
(u↓1 - iv↓1)↑{e↓1} \ldotss (u↓r + iv↓r)↑{e↓r}(u↓r
- iv↓r)↑{e↓r}$, we get $2↑{r-1}$ distinct solutions to
$x↓1↑2 + x↓2↑2 = N$, $\gcd(x↓1, x↓2) = 1$, $x↓1$
even, by letting $|x↓2| + i|x↓1| = (u↓1 + iv↓1)↑{e↓1}(u↓2
\pm iv↓2)↑{e↓2}\ldotss(u↓r \pm iv↓r)↑{e↓r}$; and all
such solutions are obtained in this way.
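The two-squares procedure just described can be sketched in Python with exact integer arithmetic (an editorial illustration, not part of the answer; the function names are mine, and the search for a suitable $n$ is the naive one):

```python
def sqrt_minus1(p):
    # Solve x^2 = -1 (mod p) for a prime p of the form 4k+1, by trying
    # x = n^((p-1)/4) mod p; this works for about half of all n.
    for n in range(2, p):
        x = pow(n, (p - 1) // 4, p)
        if x * x % p == p - 1:
            return x
    raise ValueError("p must be a prime of the form 4k+1")

def gaussian_gcd(a, b):
    # Euclid's algorithm over the Gaussian integers, all in exact integer
    # arithmetic; a and b are (real, imaginary) pairs.
    while b != (0, 0):
        c, d = b
        n = c * c + d * d                                   # norm of divisor
        re, im = a[0] * c + a[1] * d, a[1] * c - a[0] * d   # n times a/b
        q0, q1 = (2 * re + n) // (2 * n), (2 * im + n) // (2 * n)  # round a/b
        a, b = b, (a[0] - q0 * c + q1 * d, a[1] - q0 * d - q1 * c)
    return a

def two_squares(p):
    # Fermat: p = u^2 + v^2, obtained as gcd(x + i, p) where x^2 = -1 (mod p).
    x = sqrt_minus1(p)
    u, v = gaussian_gcd((x, 1), (p, 0))
    return tuple(sorted((abs(u), abs(v))))
```

Rounding the Gaussian quotient to the nearest lattice point keeps the remainder's norm at most half the divisor's, so the loop terminates; the prime 19837604177 mentioned below indeed splits as $131884↑2 + 49439↑2$.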
{\sl Note:} When $m = 10↑e$, a similar procedure
can be used, but it is five times as much work since we must
keep trying until finding a solution with $x↓1 ≡ 0 \modulo
{10}$. For example, when $m = 10↑{10}$ we have $\lfloor m/
\sqrt{3}\rfloor = 5773502691$, and $5773502689 = 53 \cdot
108934013 = (7 + 2i)(7 - 2i)(2203 + 10202i)(2203 - 10202i)$.
Of the two solutions $|x↓2| + i|x↓1| = (7 + 2i)(2203 + 10202i)$
or $(7 + 2i)(2203 - 10202i)$, the former gives $|x↓1| = 67008$
(no good) and the latter gives $|x↓1| = 75820$, $|x↓2| = 4983$
(which is usable). Line 9 of Table 1 was obtained by taking
$x↓1 = 75820$, $x↓2 = -4983$.
Line 20 of the table was obtained as follows:
$\lfloor 2↑{35}/\sqrt{3}\rfloor = 19837604196$; 19837604193
is divisible by 3 so it is ineligible; 19837604189 is divisible
by 19, and 19837604185 by 7, and 19837604181 by 3; 19837604177
is prime and equals $131884↑2 + 49439↑2$. The corresponding multiplier
is 1175245817; a better one could be found if we continued searching.
The multiplier on line 24 was obtained as the best of the first sixteen
multipliers found by this procedure when $m=2↑{32}$.
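The search for line 20 is easy to replay mechanically; the sketch below (illustrative only, using naive trial division, which is adequate at this size) steps down through candidates N ≡ 1 (mod 4) until every prime factor has the form 4k+1:

```python
def factor(n):
    # naive trial division; fine for n < 2^35
    fs, d = [], 2
    while d * d <= n:
        while n % d == 0:
            fs.append(d)
            n //= d
        d += 1
    if n > 1:
        fs.append(n)
    return fs

def first_eligible(start):
    # largest N <= start with N = 1 (mod 4) all of whose prime factors
    # have the form 4k+1
    n = start - (start - 1) % 4     # step down to N = 1 (mod 4)
    while any(p % 4 != 1 for p in factor(n)):
        n -= 4
    return n
```

Starting from ⌊2^35/√3⌋ = 19837604196, the candidates 19837604193, 19837604189, 19837604185, and 19837604181 are eliminated as in the text, and 19837604177 survives.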
%folio 693 galley 3 (C) Addison-Wesley 1978 *
\ansno 12. ${U↓j}↑\prime \cdot {U↓j}↑\prime=U↓j \cdot U↓j+2\sum
↓{i≠j}q↓iU↓i \cdot U↓j + \sum ↓{i≠j}\sum ↓{k≠j}q↓iq↓kU↓i \cdot
U↓k$. The partial derivative with respect to $q↓k$ is twice
the left-hand side of (26). If the minimum can be achieved,
these partial derivatives must all vanish.
\ansno 13. $u↓{11} = 1$, $u↓{21} =$ irrational, $u↓{12} = u↓{22} = 0$.
\ansno 14. After three Euclidean steps we find $\nu↓2↑2
= 5↑2 + 5↑2$, then S4 produces
\def\\#1{\left(\vcenter{\halign{\hfill$##$\quad⊗\hfill$##$\quad⊗\hfill$##$\cr#1}}
\right)}
$$U = \\{-5⊗5⊗0\cr-18⊗-2⊗0\cr1⊗-2⊗1\cr},\qquad
V=\\{-2⊗18⊗38\cr-5⊗-5⊗-5\cr0⊗0⊗100\cr}.$$
\def\¬{\char'403} % asterisk centered on axis
Transformations $(j, q↓1, q↓2, q↓3) = (1, \¬, 0,
2)$, $(2, -4, \¬, 1)$, $(3, 0, 0, \¬)$, $(1, \¬, 0, 0)$ result in
$$U = \\{-3⊗1⊗2\cr-5⊗-8⊗-7\cr1⊗-2⊗1\cr},\qquad V=\\{-22⊗-2⊗18\cr
-5⊗-5⊗-5\cr9⊗-31⊗29\cr},\qquad Z = (0\quad0\quad1).$$
Thus $\nu ↓3 = \sqrt{6}$, as
we already knew from exercise 3.
\ansno 15. The largest achievable $q$ in (11), minus the smallest
achievable, plus 1, is $|u↓1| + \cdots + |u↓t| - \delta $, where
$\delta = 1$ if $u↓iu↓j <
0$ for some $i$ and $j$, otherwise $\delta = 0$. For example
if $t = 5$, $u↓1 > 0$, $u↓2 > 0$, $u↓3 > 0$, $u↓4 = 0$, and $u↓5 < 0$, the largest
achievable value is $q = u↓1 + u↓2 + u↓3 - 1$ and the smallest
is $q = u↓5 + 1 = -|u↓5| + 1$.
[Note that the number of hyperplanes is unchanged
when $c$ varies, hence the same answer applies to the problem
of covering $L$ instead of $L↓0$. However, the stated formula
is {\sl not} always exact for covering $L↓0$, since the hyperplanes
which intersect the unit hypercube may not all contain points
of $L↓0$. In the example above, we can never achieve the value $q = u↓1
+ u↓2 + u↓3 - 1$ in $L↓0$ if $u↓1 + u↓2 + u↓3 > m$; it is achievable iff
there is a solution to $m - u↓1 - u↓2 - u↓3 = x↓1u↓1 + x↓2u↓2
+ x↓3u↓3 + x↓4|u↓5|$ in nonnegative integers $(x↓1, x↓2, x↓3,
x↓4)$. It may be true that the stated limits are always achievable
when $|u↓1| + \cdots +|u↓t|$ is minimal, but this does not appear
to be obvious.]
\ansno 16. It suffices to determine all solutions to (15) having
minimum $|u↓1| + \cdots +|u↓t|$, subtracting 1 if any one of
these solutions has components of opposite sign.
Instead of positive definite quadratic forms, we
work with $f(x↓1, \ldotss , x↓t) = |x↓1U↓1 + \cdots + x↓tU↓t|$,
defining $|Y| = |y↓1| + \cdots + |y↓t|$. Inequality (21) can
be replaced by $|x↓k| ≤ (\max↓{1≤j≤t}|v↓{kj}|)\,f(y↓1,
\ldotss , y↓t)$.
Thus a workable algorithm can be obtained as follows.
Replace steps S1 through S3 by: ``Set $U ← (m)$, $V ← (1)$, $r ←
1$, $s ← m$, $t ← 1$.'' (Here $U$ and $V$ are $1 \times 1$ matrices;
thus the two-dimensional case will be handled by the general
method. A special procedure for $t = 2$ could, of course, be
devised.) In steps S4 and S8, set $s ← \min(s, |U↓k|)$. In
step S8, set $z↓k ← \lfloor \max↓{1≤j≤t}|v↓{kj}|\,s/m\rfloor$.
In step S10, set $s ← \min(s, |Y| - \delta )$; and in step
S11, output $s = N↓t$. Otherwise leave the algorithm as it stands,
since it already produces suitably short vectors. [{\sl Math.\
Comp.\ \bf 29} (1975), 827--833.]
\ansno 17. When $k > t$ in S10, and if $Y \cdot Y ≤ s$, output
$Y$ and $-Y$; furthermore if $Y \cdot
Y < s$, take back the previous output of vectors for this $t$.
[In the author's experience preparing Table 1, there was
exactly one vector (and its negative) output
for each $\nu ↓t$, except when $y↓1 = 0$ or $y↓t = 0$.]
\ansno 18. (a) Let $x = m$, $y = (1 - m)/3$, $v↓{ij} = y +
x\delta ↓{ij}$, $u↓{ij} = -y + \delta ↓{ij}$. Then $V↓j \cdot
V↓k = {1\over 3}(m↑2 - 1)$ for $j ≠ k$, $V↓k \cdot V↓k = {2\over
3}(m↑2 + {1\over 2})$, $U↓j \cdot U↓j = {1\over 3}(m↑2 + 2)$, $z↓k
\approx \sqrt{2\over 9}\,m$. $\biglp$This example satisfies
(28) with $a = 1$ and works for all $m ≡ 1 \modulo 3$.$\bigrp$
(b) Interchange the
r\A oles of $U$ and $V$ in step S5. Also set $s ← \min(s, U↓i
\cdot U↓i)$ for all $U↓i$ that change. For example, when $m
= 64$ this transformation with $j = 1$, applied to the matrices
of (a), reduces
$$V = \\{43⊗-21⊗-21\cr-21⊗43⊗-21\cr-21⊗-21⊗43\cr},\quad
U=\\{22⊗21⊗21\cr21⊗22⊗21\cr21⊗21⊗22\cr}$$
\vskip-12pt\hjust{to}\vskip-18pt
$$V=\\{1⊗1⊗1\cr-21⊗43⊗-21\cr-21⊗-21⊗43\cr},\quad
U=\\{22⊗21⊗21\cr-1⊗1⊗0\cr-1⊗0⊗1\cr}.$$
[Since the transformation can increase the
length of $V↓j$, an algorithm which incorporates both transformations
must be careful to avoid infinite looping.]
\ansno 19. No, since a product of non-identity matrices with
all off-diagonal elements nonnegative and all diagonal elements
1 cannot be the identity.
%This paragraph break introduced to fill up a sparse page
[However, looping would be possible
if a subsequent transformation with $q = -1$ were performed
when $-2V↓i \cdot V↓j = V↓j \cdot V↓j$; the rounding rule must
be asymmetric with respect to sign if non-shortening transformations
are allowed.]
\ansno 20. Use the ordinary spectral test for $a$ and $m = 2↑{e-2}$;
cf.\ exercise 3.2.1.2--9. [On intuitive grounds, the same answer
should apply also when $a \mod 8 = 3$.]
\ansno 21. $X↓{4n+4} ≡ X↓{4n} \modulo 4$, so it is appropriate
to let $V↓1 = (4, 4a↑2, 4a↑3)/m$, $V↓2 = (0, 1, 0)$, $V↓3 = (0,
0, 1)$ define the corresponding lattice $L↓0$.
\ansno 24. Let $m = p$; an analysis paralleling the text can
be given. For example, when $t = 4$ we have $X↓{n+3} = \biglp
(a↑2 + b)X↓{n+1} + abX↓n\bigrp \mod m$, and we want to minimize
$u↓1↑2 + u↓2↑2 + u↓3↑2 + u↓4↑2 ≠
0$ such that $u↓1 + bu↓3 + abu↓4≡ u↓2 + au↓3 + (a↑2 + b)u↓4
≡ 0 \modulo m$.
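As a quick sanity check (my illustration, with arbitrary small parameters), the two-step identity used here follows from applying the recurrence X_{n+2} = (a X_{n+1} + b X_n) mod m twice:

```python
def check_two_step(a, b, m, x0, x1, steps=100):
    # Generate X_{n+2} = (a*X_{n+1} + b*X_n) mod m and verify the derived
    # identity X_{n+3} = ((a*a + b)*X_{n+1} + a*b*X_n) mod m along the way.
    xs = [x0, x1]
    for _ in range(steps):
        xs.append((a * xs[-1] + b * xs[-2]) % m)
    return all(xs[n + 3] == ((a * a + b) * xs[n + 1] + a * b * xs[n]) % m
               for n in range(len(xs) - 3))
```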
Replace steps S1 through S3 by the operations of
setting
$$U ← \left({m\atop 0}\quad {0\atop m}\right),\qquad V ← \left({1\quad
0\atop 0\quad 1}\right),\qquad R ← \left({1\quad 0\atop 0\quad 1}\right),\qquad
s ← m↑2,\qquad t ← 2,$$
and outputting $\nu ↓2 = m$. Replace step S4 by
\algstep S4$↑\prime$. [Advance $t$.]
If $t = T$, the algorithm terminates. Otherwise set $t ← t +
1$ and $R ← R\left({0\atop 1}\,{b\atop a}\right)\mod m$. Set $U↓t$
to the new row $(-r↓{12}, -r↓{22}, 0, \ldotss , 0, 1)$ of $t$
elements, and set $u↓{it} ← 0$ for $1 ≤ i < t$. Set $V↓t$ to
the new row $(0, \ldotss , 0, m)$. For $1 ≤ i < t$, set $q ←
\hjust{round}\biglp (v↓{i1}r↓{12} + v↓{i2}r↓{22})/m\bigrp$, $v↓{it}
← v↓{i1}r↓{12} + v↓{i2}r↓{22} - qm$, and $U↓t ← U↓t + qU↓i$.
Finally set $s ← \min(s, U↓t \cdot U↓t)$, $k ← t$, $j ← 1$.
\yskip [A similar generalization applies to all
sequences of length $p↑k-1$ defined by 3.2.2--8. Additional numerical examples
have been given by A. Grube, {\sl Zeitschrift f\"ur angewandte Math.\ und
Mechanik \bf53} (1973), T223--T225.]
\ansno 25. The given sum is at most two times the quantity $\sum↓{0≤k≤m/2d}r(dk)
=1+{1\over d}f(m/d)$, where
\penalty-200 %This will help to fill up a sparse page
$$\eqalign{f(m)⊗={1\over m}\sum↓{1≤k≤m/2}\csc(πk/m)\cr
⊗={1\over m}\int↓1↑{m/2}\csc(πx/m)\,dx+O\left(1\over m\right)={1\over π}\left.\ln
\tan\left({π\over2m}x\right)\lower9pt\null\right|↓1↑{m/2}
+O\left(1\over m\right).\cr}$$ [When
$d=1$, we have $\sum↓{0≤k<m}r(k)=(2/π)\ln m+1+(2/π)\ln(2e/π)+O(1/m)$.]
\ansno 26. When $m=1$, we cannot use (52) since $k$ will be zero. If
$\gcd(q,m)=d$, the same derivation goes through with $m$ replaced by $m/d$.
Suppose we have $m=p↓1↑{e↓1}\ldotss p↓r↑{e↓r}$ and $\gcd(a-1,m)=p↓1↑{f↓1}\ldotss
p↓r↑{f↓r}$ and $d=p↓1↑{d↓1}\ldotss p↓r↑{d↓r}$. If $m$ is replaced by $m/d$
then $s$ is replaced by
$p↓1↑{\max(0,e↓1-f↓1-d↓1)}\ldotss p↓r↑{\max(0,e↓r-f↓r-d↓r)}$.
\ansno 27. The result is clear if $\nu↓t$ is small, say $\nu↓t<2(t-1)$, because
$\sum r(u↓1,\ldotss,u↓t)$ summed over {\sl all} $u$ is $O\left(\biglp(2/π)\ln
m\bigrp↑t\right)$.
If $\nu↓t$ is large, it is convenient to use the following functions:
$\rho(x)=1$ if $x=0$, $\rho(x)=x$ if $0<x≤m/2$, $\rho(x)=m-x$ if $m/2<x<m$;
$\hjust{trunc}(x)=\lfloor x/2\rfloor$ if $0≤x≤m/2$, $\hjust{trunc}(x)=m-\lfloor
(m-x)/2\rfloor$ if $m/2<x<m$;
$L(x)=1$ if $x=0$, $L(x)=\lfloor\lg x\rfloor+1$ if $0<x≤m/2$, $L(x)=-\biglp\lfloor
\lg(m-x)\rfloor+1\bigrp$ if $m/2<x<m$; and $l(x)=\max\biglp 1,2↑{|x|-1}\bigrp$.
Note that $l\biglp L(x)\bigrp≤\rho(x)≤2l\biglp L(x)\bigrp$.
Say that a vector $(u↓1,\ldotss,u↓t)$ is {\sl bad} if it is nonzero and satisfies
(15). By hypothesis, all bad vectors satisfy $u↓1↑2+\cdots+u↓t↑2≥\nu↓t↑2≥2(t-1)$,
and it follows that $\rho(u↓1)\ldotsm\rho(u↓t)≥\sqrt{\nu↓t↑2-(t-1)}≥\nu↓t/\sqrt2$.
The vector
$(u↓1,\ldotss,u↓t)$ is said to be in class $\biglp L(u↓1),\ldotss,L(u↓t)\bigrp$;
thus there are at most $(2\lg m+1)↑t$ classes, and class $(L↓1,\ldotss,L↓t)$
contains at most $l(L↓1)\ldotsm l(L↓t)$ vectors. Our proof is based on showing that
the bad vectors in each fixed class contribute only $O(1/\nu↓t)$
to $\sum r(u↓1,\ldotss,u↓t)$.
Let $\mu=\lceil\,\lg\nu↓t/\sqrt2-1\rceil$. The {\sl $\mu$-fold truncation operator}
on a vector is defined to be the following operation repeated $\mu$ times: ``Let
$j$ be minimal such that $\rho(u↓j)>1$, and replace $u↓j$ by $\hjust{trunc}(u↓j)$;
but do nothing if $\rho(u↓j)=1$ for all $j$.'' $\biglp$This
operation essentially throws away one bit of
information about $(u↓1,\ldotss,u↓t)$.$\bigrp$
If $(u↓1↑\prime,\ldotss,u↓t↑\prime)$ and $(u↓1↑{\prime\prime},\ldotss,
u↓t↑{\prime\prime})$ are two vectors of the same class having the same $\mu$-fold
truncation, we say they are {\sl similar}\/; in this case it follows that
$\rho(u↓1↑\prime-u↓1↑{\prime\prime})\ldotsm\rho(u↓t↑\prime-u↓t↑{\prime\prime})≤
2↑\mu<\nu↓t/\sqrt2$. For example, any two vectors of the form $\biglp
(1x↓2x↓1)↓2,0,m-(1x↓3)↓2,(101x↓5x↓4)↓2,(1101)↓2\bigrp$ are similar when $m$
is large and $\mu=5$; the $\mu$-fold truncation operator successively removes
$x↓1$, $x↓2$, $x↓3$, $x↓4$, $x↓5$. Since the difference of two bad vectors
satisfies (15), it is impossible for two unequal bad vectors to be similar.
Therefore class $(L↓1,\ldotss,L↓t)$ can contain at most $\max\biglp 1,
l(L↓1)\ldotsm l(L↓t)/2↑\mu\bigrp$ bad vectors.
If class $(L↓1,\ldotss,L↓t)$ contains exactly one bad vector $(u↓1,\ldotss,u↓t)$,
we have $r(u↓1,\ldotss,u↓t)≤1/\rho(u↓1)\ldotsm\rho(u↓t)≤\sqrt2/\nu↓t$; if it
contains $≤l(L↓1)\ldotsm l(L↓t)/2↑\mu$ bad vectors, each of them has
$r(u↓1,\ldotss,u↓t)≤1/l(L↓1)\ldotsm l(L↓t)$.
\def\\{{\:a[}}\def\¬{{\:a]}}
\\Korobov's test, mentioned at the close of the section, was to find the minimum
of $\rho(u↓1)\ldotsm\rho(u↓t)$ over all bad $(u↓1,\ldotss,u↓t)$.\¬
\ansno 28. Let $\zeta=e↑{2πi/(m-1)}$ and let $S↓{kl}=\sum↓{0≤j<m-1}\omega
↑{x↓{j+l}}\zeta↑{jk}$. The analog of (51) is $|S↓{k0}|=\sqrt m$, hence the
analog of (53) is $\hjust{\:u\char'152}N↑{-1}\sum↓{0≤n<N}\omega↑{x↓n}
\hjust{\:u\char'152}=O\biglp(\sqrt m\log m)/N\bigrp$. The analogous theorem
now states that
$$D↓N↑{(t)}=O\left(\sqrt m(\log m)↑{t+1}\over N\right)+O\left((\log m)↑t/\nu↓t
\right),\qquad D↓{m-1}↑{(t)}=O\left((\log m)↑t\over\nu↓t\right).$$
In fact, $D↓{m-1}↑{(t)}≤{m-2\over m-1}\sum r(u↓1,\ldotss,u↓t)\;$\\summed
over nonzero solutions of (15)\¬$\null+{1\over m-1}\sum r(u↓1,\ldotss,u↓t)\;
$\\summed over all nonzero $(u↓1,\ldotss,u↓t)$\¬.
Let $R(a)=\sum r(u↓1,\ldotss,u↓t)$ summed over nonzero solutions of (15). Since
$m$ is prime, each $(u↓1,\ldotss,u↓t)$ can be a solution to (15) for at most
$t-1$ values of $a$, hence $\sum↓{0<a<m}R(a)≤(t-1)\sum r(u↓1,\ldotss,u↓t)=
O\biglp t(\log m)↑t\bigrp$. It follows that the average value of $R(a)$ taken
over all $\varphi(m-1)$ primitive roots is $O\biglp t(\log m)↑t/\varphi(m-1)\bigrp$.
$\biglp$In general $1/\varphi(n)=O(\log\log n/n)$; hence
we have proved that {\sl for all
prime $m$ and for all $T$ there exists a primitive root $a$ modulo $m$ such
that the linear congruential sequence $(1,a,0,m)$ has discrepancy $D↓{m-1}
↑{(t)}=O\biglp m↑{-1}T(\log m)↑T\log\log m\bigrp$ for $1≤t≤T$.} This method
of proof does {\sl not} extend to a similar result for linear congruential
generators of period $2↑e$ modulo $2↑e$, since for example the vector
$(1,-3,3,-1)$ solves (15) for about $2↑{2e/3}$ values of $a$.$\bigrp$
%folio 701 galley 4 (C) Addison-Wesley 1978 *
\ansbegin{3.4.1}
\ansno 1. $α + (β - α)U$.
\ansno 2. Let $U = X/m$; $\lfloor kU\rfloor = r$
iff $r ≤ kX/m < r + 1$ iff $mr/k ≤ X < m(r +
1)/k$ iff $\lceil mr/k\rceil ≤ X < \lceil m(r + 1)/k\rceil$.
The exact probability is $(1/m)\biglp\lceil m(r + 1)/k\rceil -
\lceil mr/k\rceil \bigrp = 1/k + \epsilon$, $|\epsilon | < 1/m$.
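The ceiling count in this answer can be checked exhaustively for small m and k; the following sketch (illustrative, with exact integer ceilings to avoid floating-point surprises) verifies both the exact probability and the bound |ε| < 1/m:

```python
def ceil_div(a, b):
    # exact ceiling of a/b for nonnegative integers a, positive b
    return -(-a // b)

def floor_ku_counts(m, k):
    # how many X in {0, ..., m-1} give floor(k*X/m) == r, for each r
    counts = [0] * k
    for x in range(m):
        counts[k * x // m] += 1
    return counts

def predicted_count(m, k, r):
    # ceil(m(r+1)/k) - ceil(mr/k), as derived in the answer
    return ceil_div(m * (r + 1), k) - ceil_div(m * r, k)
```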
\ansno 3. If full-word random numbers are given,
the result will be sufficiently random as in exercise 2. But
if a linear congruential sequence is used, $k$ must be relatively
prime to the modulus $m$, lest the numbers have a very short
period, by the results of Section 3.2.1.1. For example,
if $k = 2$ and $m$ is even, the numbers will at
best be alternately 0 and 1.
The method is slower than (1) in nearly every case, so it is
not recommended.
\ansno 4. $\max(X↓1, X↓2) ≤ x$ if and only if $X↓1
≤ x$ and $X↓2 ≤ x$; $\min(X↓1, X↓2) ≥ x$ if and only if $X↓1
≥ x$ and $X↓2 ≥ x$. The probability that two independent events
both happen is the product of the individual probabilities.
\ansno 5. Obtain independent uniform deviates $U↓1$, $U↓2$.
Set $X ← U↓2$. If $U↓1 ≥ p$, set $X ← \max(X, U↓3)$, where
$U↓3$ is a third uniform deviate. If $U↓1 ≥ p + q$, also set
$X ← \max(X, U↓4)$, where $U↓4$ is a fourth uniform deviate.
This method can obviously be generalized to any polynomial,
and indeed even to infinite power series (as shown for example
in Algorithm S, which uses minimization instead of maximization).
Alternatively, we could proceed as follows (suggested
by M. D. MacLaren): If $U↓1 < p$, set $X ← U↓1/p$; otherwise
if $U↓1 < p + q$, set $X ← \max\biglp (U↓1 - p)/q, U↓2\bigrp
$; otherwise set $X ← \max\biglp (U↓1 - p - q)/r, U↓2, U↓3\bigrp
$. This method requires less time than the other to obtain the
uniform deviates, although it involves further arithmetical
operations and it is slightly less stable numerically.
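MacLaren's variant can be simulated directly; the sketch below (my illustration, with arbitrary coefficients satisfying p + q + r = 1) produces a deviate whose distribution function is F(x) = px + qx² + rx³:

```python
import random

def maclaren_deviate(p, q, r, rng):
    # one-comparison mixture for F(x) = p*x + q*x^2 + r*x^3
    u1 = rng.random()
    if u1 < p:
        return u1 / p                                   # a single uniform
    if u1 < p + q:
        return max((u1 - p) / q, rng.random())          # max of two uniforms
    return max((u1 - p - q) / r, rng.random(), rng.random())  # max of three
```

Rescaling the leftover part of $U↓1$ reuses it as one of the uniforms being maximized, which is what saves the extra deviate.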
\ansno 6. $F(x) = A↓1/(A↓1 + A↓2)$, where $A↓1$
and $A↓2$ are the areas in Fig$.$ A--4; so
$$F(x) = {\int ↑{x}↓0 \sqrt{1 -
y↑2}\,dy\over\int ↑{1}↓{0}\sqrt{1 - y↑2}\, dy}
= {2\over π} \mathop{\hjust{arcsin}} x + {2\over π} x \sqrt{1
- x↑2}.$$
The probability of termination at step 2 is $p
= π/4$ each time step 2 is encountered, so the number of executions
of step 2 has the geometric distribution. The characteristics
of this number are $\biglp$min 1, ave $4/π$, max $∞$, dev $(4/π)
\sqrt{1 - π/4}\bigrp$, by exercise 17.
\topinsert{\vjust to 42mm{\vfill\baselineskip11pt
\hjust to 50mm{\caption Fig. A--4. Region of ``acceptance'' for the
algorithm of exercise 6.}}}
\ansno 7. If $k=1$ then $n↓1=n$ and the problem is trivial. Otherwise
it is always possible to find $i≠j$ such that $n↓i≤n≤n↓j$. Fill $B↓i$
with $n↓i$ cubes of color $C↓i$ and $n-n↓i$ of color $C↓j$,
then decrease $n↓j$ by $n-n↓i$ and eliminate color $C↓i$. We are
left with the same sort of problem but with $k$ reduced by 1; by
induction, it's possible.
The following algorithm can be used to compute the $P$ and $Y$ tables: Form
a list of pairs $(p↓1,1)\ldotsm(p↓k,k)$ and sort it by first components,
obtaining a list $(q↓1,a↓1)\ldotsm(q↓k,a↓k)$ where $q↓1≤\cdots≤q↓k$ and $n=k$.
Repeat the following operation until $n=0$: Set $P[a↓1-1]←kq↓1$ and
$Y[a↓1-1]←x↓{a↓n}$. Delete $(q↓1,a↓1)$ and $(q↓n,a↓n)$, then insert the
new entry $\biglp q↓n-(1/k-q↓1),a↓n\bigrp$ into its proper place in the list
and decrease $n$ by 1.
(If $p↓j≤1/k$ the algorithm will never put $x↓j$ in
the $Y$ table; this fact is used implicitly in Algorithm M.
The algorithm attempts to maximize the probability that $V<P↓K$ in (3),
by always robbing from the richest remaining element and giving it to the
poorest. However, it is very difficult to determine the absolute minimum of
this probability, since such a task is at least as difficult as the ``bin-packing
problem''; cf.\ Chapter 7.)
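The table-building algorithm above can be sketched with exact rational arithmetic (an illustration; items are identified by their indices 1..k, and Y stores alias indices). A reconstruction of the sampler's exact output distribution confirms that the tables reproduce the given probabilities:

```python
from fractions import Fraction

def build_tables(probs):
    # robin-hood construction from this answer: repeatedly fill the bin of
    # the poorest remaining probability from the richest one
    k = len(probs)
    pairs = sorted((q, a) for a, q in enumerate(probs, 1))
    P, Y = [None] * k, [None] * k
    while pairs:
        q1, a1 = pairs[0]
        qn, an = pairs[-1]
        P[a1 - 1] = k * q1
        Y[a1 - 1] = an
        pairs.pop(0)
        if pairs:
            pairs.pop()
            pairs.append((qn - (Fraction(1, k) - q1), an))  # richest donates
            pairs.sort()
    return P, Y

def output_distribution(P, Y):
    # exact distribution of the sampler: choose bin K uniformly, then output
    # item K+1 if V < P[K], else item Y[K]
    k = len(P)
    out = [Fraction(0)] * (k + 1)
    for i in range(k):
        out[i + 1] += Fraction(1, k) * P[i]
        out[Y[i]] += Fraction(1, k) * (1 - P[i])
    return out[1:]
```

The invariant is that the remaining probabilities always sum to (number remaining)/k, so the poorest is always ≤ 1/k and the richest ≥ 1/k, and every donation is recorded exactly once in the Y table.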
\ansno 8. Replace $P↓j$ by $(j+P↓j)/k$ for $0≤j<k$.
\ansno 9. Consider the sign of $f↑{\prime\prime}(x) =
\sqrt{2/π}\,(x↑2 - 1)e↑{-x↑2/2}$.
\ansno 10. Let $S↓j=(j-1)/5$ for $1≤j≤16$ and $p↓{j+15}=F(S↓{j+1})-
F(S↓j)-p↓j$ for $1≤j≤15$; also let $p↓{31}=1-F(3)$ and $p↓{32}=0$.
$\biglp$Eq.\ (15) defines $p↓1$, $\ldotss$, $p↓{15}$.$\bigrp$
The algorithm of exercise 7 can now be used with $k=32$ to compute
$P↓j$ and $Y↓j$, after which we will have $1≤Y↓j≤15$ for $1≤j≤32$. Set
$P↓0←P↓{32}$ (which is 0) and $Y↓0←Y↓{32}$. Then set $Z↓j←1/(5-5P↓j)$ and
$Y↓j←{1\over5}Y↓j-Z↓j$ for $0≤j<32$; $Q↓j←1/(5P↓j)$ for $1≤j≤15$.
Let $h=1/5$ and $f↓{j+15}(x)=\sqrt{2/π}\,\vcenter{\hjust{\:@\char0}}
e↑{-x↑2/2}-e↑{-j↑2/50}\vcenter{\hjust{\:@\char1}}
/p↓{j+15}$
for $S↓j≤x≤S↓j+h$. Then let $a↓j=f↓{j+15}(S↓j)$ for $1≤j≤5$, $b↓j=f↓{j+15}(S↓j)$
for $6≤j≤15$, $b↓j=-hf↑\prime↓{j+15}(S↓j+h)$ for $1≤j≤5$,
and $a↓j=f↓{j+15}(x↓j)+(x↓j-S↓j)b↓j/h$ for $6≤j≤15$, where $x↓j$ is the root
of the equation $f↓{j+15}↑\prime=-b↓j/h$. Finally set $D↓{j+15}←a↓j/b↓j$ for
$1≤j≤15$ and $E↓{j+15}←25/j$ for $1≤j≤5$, $E↓{j+15}←1/(e↑{(2j-1)/50}-1)$
for $6≤j≤15$.
Table 1 was computed while making use of the following intermediate values:
$(p↓1,\ldotss,p↓{31})=(.156$, .147, .133, .116, .097, .078, .060, .044, .032,
.022, .014, .009, .005, .003, .002, .002, .005, .007, .009, .010, .009, .009,
.008, .006, .005, .004, .002, .002, .001, .001, .003); $(x↓6,\ldotss,x↓{15})
=(1.115$, 1.304, 1.502, 1.700, 1.899, 2.099, 2.298, 2.497, 2.697, 2.896);
$(a↓1,\ldotss,a↓{15})=(7.5$, 9.1, 9.5, 9.8, 9.9, 10.0, 10.0, 10.1, 10.1, 10.1,
10.1, 10.2, 10.2, 10.2, 10.2); $(b↓1,\ldotss,b↓{15})=(14.9$, 11.7, 10.9, 10.4,
10.1, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.7, 10.8, 10.9).
\ansno 11. Let $g(t) = e↑{9/2}te↑{-t↑2/2}$ for $t
≥ 3$. Since $G(x) = \int ↑{x}↓{3} g(t)\,dt = 1 - e↑{-(x↑2-9)/2}
$, a random variable $X$ with density $g$ can be computed
by setting $X ← G↑{-1}(1 - V) = \sqrt{9 -
2 \ln V}$. Now $e↑{-t↑2/2} ≤ (t/3)e↑{-t↑2/2}$
for $t ≥ 3$, so we obtain a valid rejection method if we accept
$X$ with probability $f(X)/cg(X) = 3/X$.
\ansno 12. We have $f↑\prime (x) = xf(x) - 1 < 0$ for $x ≥ 0$,
since $f(x) = x↑{-1} - e↑{x↑2/2} \int ↑{∞}↓{x} e↑{-t↑2/2}\,dt/t↑2$
for $x > 0$. Let $x = a↓{j-1}$ and $y↑2 = x↑2
+ 2 \ln 2$; then $$\textstyle\sqrt{2/π} \int ↑{∞}↓{y}
e↑{-t↑2/2}\,dt = {1\over 2}\sqrt{2/π}
\,e↑{-x↑2/2}f(y) < {1\over 2}\sqrt{2/π}
\,e↑{-x↑2/2}f(x) = 2↑{-j},$$ hence $y > a↓j$.
\ansno 13. Take $b↓j = \mu ↓j$; consider now the problem with
$\mu ↓j = 0$ for each $j$. In matrix notation, if $Y = AX$,
where $A = (a↓{ij})$, we need $AA↑T = C = (c↓{ij})$. [In other
notation, if $Y↓j = \sum a↓{jk}X↓k$, then the average value
of $Y↓iY↓j$ is $\sum a↓{ik}a↓{jk}$.] If this matrix equation
can be solved for $A$, it can be solved when $A$ is triangular,
since $A = BU$ for some orthogonal matrix $U$ and some triangular
$B$, and $BB↑T = C$. The desired triangular solution can be
obtained by solving the equations $a↓{11}↑2 = c↓{11}$,
$a↓{11}a↓{21} = c↓{12}$, $a↓{21}↑2+a↓{22}↑2= c↓{22}$,
$a↓{11}a↓{31} = c↓{13}$, $a↓{21}a↓{31} + a↓{22}a↓{32} = c↓{23}$,
$\ldotss $, successively for $a↓{11}$, $a↓{21}$, $a↓{22}$, $a↓{31}$,
$a↓{32}$, etc. [{\sl Note: } The covariance matrix must
be positive semidefinite, since the average value of $\biglp\sum y↓jY↓j\bigrp↑2$
is $\sum c↓{ij}y↓iy↓j$, which must be nonnegative. And there
is always a solution when $C$ is positive semidefinite, since
$C = U↑{-1}\hjust{diag}(λ↓1, \ldotss , λ↓n)U$, where the eigenvalues
$λ↓j$ are nonnegative, and $U↑{-1}\hjust{diag}(\sqrt{λ↓1},
\ldotss ,\sqrt{λ↓n})U$ is a solution.]
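The triangular system here is just a Cholesky factorization; a short sketch (my illustration, with a hypothetical covariance matrix) solves the equations in the stated order and checks that AAᵀ = C:

```python
from math import sqrt

def cholesky(C):
    # lower-triangular A with A * A^T = C, solving a11^2 = c11,
    # a11*a21 = c12, a21^2 + a22^2 = c22, ... successively
    n = len(C)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = C[i][j] - sum(A[i][k] * A[j][k] for k in range(j))
            A[i][j] = sqrt(s) if i == j else s / A[j][j]
    return A

def dependent_normals(A, mu, xs):
    # Y_j = mu_j + sum_k a_{jk} X_k for independent standard normals xs;
    # A is lower triangular, so only k <= j contributes
    return [mu[j] + sum(A[j][k] * xs[k] for k in range(j + 1))
            for j in range(len(mu))]
```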
\ansno 14. $F(x/c)$ if $c > 0$, a step function if $c = 0$,
and $1 - F(x/c)$ if $c < 0$.
\ansno 15. Distribution $\int ↑{∞}↓{-∞} F↓1(x - t)\,dF↓2(t)$.
Density $\int ↑{∞}↓{-∞} f↓1(x - t)f↓2(t)\,dt$. This is called
the {\sl convolution} of the given distributions.
%folio 707 galley 5 (C) Addison-Wesley 1978 *
\ansno 16. It is clear that $f(t) ≤ cg(t)$ for all $t$ as required.
Since $\int ↑{∞}↓{0} g(t)\,dt = 1$ we have $g(t) = Ct↑{a-1}$ for
$0 ≤ t < 1$, $Ce↑{-t}$ for $t ≥ 1$, where $C = ae/(a + e)$. A
random variable with density $g$ is easy to obtain as a mixture
of two distributions, $G↓1(x) = x↑{1/a}$ for $0 ≤ x < 1$, and
$G↓2(x) = 1 - e↑{1-x}$ for $x ≥ 1$:
\yyskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf G1. }\!
[Initialize.]\xskip Set $p ← e/(a + e)$.
(This is the probability that $G↓1$ should be used.)
\yskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf G2. }\!
[Generate $G$ deviate.]\xskip Generate
independent uniform deviates $U$, $V$, where $V ≠ 0$. If $U <
p$, set $X ← V↑{1/a}$ and $q ← e↑{-X}$; otherwise set $X ← 1
- \ln V$ and $q ← X↑{a-1}$. (Now $X$ has density $g$, and $q
= f(X)/cg(X)$.)
\yskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf G3. }\!
[Reject?]\xskip Generate a
new uniform deviate $U$. If $U ≥ q$, return to G2.\quad\blackslug
\yyskip\noindent The average number of iterations
is $c = (a + e)/e\Gamma (a + 1) < 1.4$.
It is possible to streamline this procedure in
several ways. First, we can replace $V$ by an exponential deviate
$Y$ of mean 1, generated by Algorithm S, say, and then we set
$X ← e↑{-Y/a}$ or $X ← 1 + Y$ in the two cases. Moreover, if
we set $q ← pe↑{-X}$ in the first case and $q ← p + (1 - p)X↑{a-1}$
in the second, we can use the original $U$ instead of a newly
generated one in step G3. Finally if $U < p/e$ we can accept
$V↑{1/a}$ immediately, avoiding the calculation of $q$ about
30 percent of the time.
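Steps G1–G3 (in their unstreamlined form) can be sketched as follows; this is an illustration only, and its mean is checked statistically against the gamma distribution's mean a:

```python
import math
import random

def gamma_deviate(a, rng):
    # steps G1-G3 for a gamma deviate of order a, 0 < a < 1
    p = math.e / (a + math.e)
    while True:
        U, V = rng.random(), rng.random()
        if V == 0.0:
            continue
        if U < p:
            X = V ** (1.0 / a)        # from G1(x) = x^(1/a)
            q = math.exp(-X)
        else:
            X = 1.0 - math.log(V)     # from G2(x) = 1 - e^(1-x)
            q = X ** (a - 1.0)
        if rng.random() < q:          # accept with probability f(X)/cg(X)
            return X
```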
\ansno 17. (a) $F(x) = 1 - (1 - p)↑{\lfloor x \rfloor}$, for $x ≥ 0$.\xskip
(b) $G(z) = pz/\biglp 1 - (1 - p)z\bigrp$.\xskip (c) Mean $1/p$,
standard deviation $\sqrt{1-p}/p$. To do the
latter calculation, observe that if $H(z) = q + (1 - q)z$, then
$H↑\prime (1) = 1 - q$ and $H↑{\prime\prime} (1) + H↑\prime (1) - \biglp
H↑\prime (1)\bigrp ↑2 = q(1 - q)$, so the mean and variance
of $1/H(z)$ are $q - 1$ and $q(q - 1)$, respectively. (See Section
1.2.10.) In this case, $q = 1/p$; the extra factor $z$ in the
numerator of $G(z)$ increases the mean by one.
\ansno 18. Set $N ← N↓1 + N↓2 - 1$, where $N↓1$ and $N↓2$
independently have the geometric distribution for probability
$p$. (Consider the generating function.)
\ansno 19. Set $N ← N↓1 + \cdots + N↓t - t$, where the $N↓j$
have the geometric distribution for $p$. (This is the number
of failures before the $t$th success, when a sequence of independent
trials is made, each of which succeeds with probability $p$.)
For $t = p = {1\over 2}$, and in general when
the mean value $\biglp$namely $t(1 - p)/p\bigrp$
of the distribution is small, we can simply evaluate the probabilities
$p↓n = {t - 1 + n\choose n}p↑t(1 - p)↑n$ consecutively for $n
= 0$, 1, 2, $\ldots$ as in the following algorithm:
\yyskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf B1. }\!
[Initialize.]\xskip Set $N ← 0$, $q ← p↑t$,
$r ← q$, and generate a random uniform deviate $U$. (We will
have $q = p↓N$ and $r = p↓0 + \cdots + p↓N$ during this algorithm,
which stops as soon as $U < r$.)
\yskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf B2. }\!
[Search.]\xskip If $U ≥ r$, set $N
← N + 1$, $q ← q(1 - p)(t - 1 + N)/N$, $r ← r + q$, and repeat this
step.\quad\blackslug
\yyskip [An interesting technique
for the negative binomial distribution, for arbitrarily large
real values of $t$, has been suggested by R. L\'eger:
First generate a random gamma deviate $X$ of order $t$, then let
$N$ be a random Poisson deviate of mean $X(1 - p)/p$.]
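Steps B1–B2 above can be sketched directly (my illustration, checked statistically against the mean t(1−p)/p and the probability p^t of zero failures):

```python
import random

def neg_binomial(t, p, rng):
    # algorithm B1-B2: walk the cumulative probabilities
    # p_n = C(t-1+n, n) * p^t * (1-p)^n until U < r
    U = rng.random()
    N = 0
    q = r = p ** t
    while U >= r:
        N += 1
        q *= (1 - p) * (t - 1 + N) / N
        r += q
    return N
```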
\ansno 20. $\hjust{R1}=1+(1-A/R)\cdot\hjust{R1}$. When R2 is performed,
the algorithm terminates with probability $I/R$; when R3 is performed,
it goes to R1 with probability $E/R$. We have
$$\vjust{\halign{#\qquad⊗\ctr{$#$}\qquad⊗\ctr{$#$}\qquad⊗\ctr{$#$}\qquad
⊗\ctr{$#$}\cr
R1⊗R/A⊗R/A⊗R/A⊗R/A\cr
R2⊗0⊗R/A⊗0⊗R/A\cr
R3⊗0⊗0⊗R/A⊗R/A-I/A\cr
R4⊗R/A⊗R/A-I/A⊗R/A-E/A⊗R/A-I/A-E/A\cr}}$$
\ansno 21. $R=\sqrt{8/e}\approx 1.71153$; $A=\sqrt{π/2}\approx 1.25331$.
Since $$\int u\sqrt{a-bu}\,du=\textstyle
(a-bu)↑{3/2}\left({2\over5}(a-bu)-{2\over3}a\right)
/b↑2,$$ we have $I=\int↓0↑{a/b}u\sqrt{a-bu}\,du={4\over15}a↑{5/2}/b↑2$ where $a=4(1+
\ln c)$ and $b=4c$; when $c=e↑{1/4}$, $I$ has its maximum value ${5\over6}\sqrt
{5/e}\approx1.13020$. Finally the following integration formulas are needed for
$E$:$$\eqalign{\int\sqrt{bu-au↑2}\,du⊗
\textstyle={1\over8}b↑2a↑{-3/2}\hjust{arcsin}(2ua/b-1)+
{1\over4}ba↑{-1}\sqrt{bu-au↑2}(2ua/b-1),\cr\noalign{\vskip 2pt}
\int\sqrt{bu+au↑2}\,du⊗\textstyle=-{1\over 8}
b↑2a↑{-3/2}\ln\vcenter{\hjust{\:@\char0}}
\sqrt{bu+au↑2}+u\sqrt a+b/2\sqrt a\vcenter{\hjust{\:@\char1}}\cr
⊗\hskip100pt \textstyle\null+{1\over4}ba↑{-1}\sqrt{bu+au↑2}\,
(2ua/b+1),\cr}$$ where $a,b>0$. Let the test in step R3 be ``$X↑2≥4e↑{x-1}/U-4x$'';
then the exterior region hits the top of the rectangle when $u=r(x)=\biglp e↑x
-\sqrt{e↑{2x}-2ex}\bigrp/2ex$. (Incidentally, $r(x)$ reaches its maximum value
at $x=1/2$, a point where it is {\sl not} differentiable!) We have $E=\int↓0
↑{r(x)}\biglp\sqrt{2/e}-\sqrt{bu-au↑2}\bigrp\,du$ where $b=4e↑{x-1}$ and
$a=4x$. The maximum value of $E$ occurs near $x=-.35$, where we
have $E\approx.29410$.
\ansno 22. (Solution by G. Marsaglia.)\xskip Consider the ``continuous
Poisson distribution'' $G(x) = \int ↑{∞}↓{\mu } e↑{-t}t↑{x-1}
\,dt/\Gamma (x)$, for $x > 0$; if $X$ has this distribution then
$\lfloor X\rfloor$ is Poisson distributed, since $G(x + 1) -
G(x) = e↑{-\mu}\mu↑x/x!$. If $\mu$ is large, $G$ is approximately
normal, hence $G↑{-1}\biglp F↓\mu(x)\bigrp$ is approximately linear,
where $F↓\mu(x)$ is the distribution function for a normal deviate
with mean and variance $\mu$; i.e., $F↓\mu(x)=F\biglp (x-\mu)/\sqrt\mu\bigrp$,
where $F(x)$ is the normal distribution function (10). Let $g(x)$
be an efficiently computable function such that $|G↑{-1}\biglp
F↓\mu(x)\bigrp - g(x)| < \epsilon$ for $-∞ < x < ∞$; we can now
generate Poisson deviates efficiently as follows: Generate a
normal deviate $X$, and set $Y ← g(\mu+\sqrt\mu\,X)$, $N ← \lfloor Y\rfloor
$, $M ← \lfloor Y + {1\over 2}\rfloor $. If $|Y - M| > \epsilon
$, output $N$; otherwise output $M - 1$ or $M$, according as
$G↑{-1}\biglp F(X)\bigrp < M$ or not.
This approach applies also to the binomial distribution,
with $$G(x) = \int ↑{1}↓{p} u↑{x-1}(1 - u)↑{t-x}\, du \Gamma (t
+ 1)/\Gamma (x)\Gamma (t + 1 - x),$$ since $\lfloor G↑{-1}(U)\rfloor$
is binomial with parameters $(t, p)$ and $G$ is approximately
normal.
%folio 710 galley 6 (C) Addison-Wesley 1978 *
\ansno 23. Yes. The second method calculates $|\cos 2\theta
|$, where $\theta$ is uniformly distributed between 0 and $π/2$.
(Let $U = r \cos \theta$, $V = r \sin \theta$.)
\ansno 25. ${21\over 32} = (.10101)↓2$. In general,
the binary representation is formed by using 1 for $∨$, 0 for
$∧$, from left to right, then suffixing 1. This technique [cf.\
K. D. Tocher, {\sl J. Roy.\ Stat.\ Soc.\ \bf B-16} (1954), 49]
can lead to efficient generation of independent bits having
a given probability $p$, and it can also be applied to the geometric and binomial
distributions.
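Tocher's rule (1 for ∨, 0 for ∧, left to right, then suffix 1) can be verified by exhaustive enumeration over all patterns of fair bits; the sketch below is an illustration with my own function names:

```python
def combine(bits, ops):
    # right-to-left fold: bits[0] op[0] (bits[1] op[1] (... bits[-1]));
    # op '1' means OR, op '0' means AND
    acc = bits[-1]
    for b, op in zip(reversed(bits[:-1]), reversed(ops)):
        acc = (b | acc) if op == '1' else (b & acc)
    return acc

def prob_as_fraction(p_bits):
    # p = (.p_bits)_2 with final bit 1; the operators are the leading bits
    n = len(p_bits)
    ops = p_bits[:-1]
    count = 0
    for mask in range(1 << n):
        bits = [(mask >> i) & 1 for i in range(n)]
        count += combine(bits, ops)
    return count, 1 << n          # numerator and denominator 2^n
```

For 21/32 = (.10101)₂ the expression is B₁ ∨ (B₂ ∧ (B₃ ∨ (B₄ ∧ B₅))), which is true for exactly 21 of the 32 bit patterns.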
\ansno 26. (a) True, $\sum ↓k\Pr(N↓1 = k)\Pr(N↓2 = n - k)
= e↑{-\mu↓1-\mu↓2}(\mu↓1+\mu↓2)↑n/n!$.\xskip (b)
False, unless $\mu ↓2 = 0$; otherwise $N↓1 - N↓2$ might be
negative.
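Part (a)'s convolution identity is easy to confirm numerically (illustration; the means 1.5 and 2.25 are arbitrary):

```python
from math import exp, factorial

def poisson_pmf(mu, n):
    # Pr(N = n) for a Poisson deviate of mean mu
    return exp(-mu) * mu ** n / factorial(n)

def convolved(mu1, mu2, n):
    # Pr(N1 + N2 = n) for independent Poisson deviates of means mu1, mu2
    return sum(poisson_pmf(mu1, k) * poisson_pmf(mu2, n - k)
               for k in range(n + 1))
```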
\ansno 27. Let the binary representation of $p$ be $(.b↓1b↓2b↓3
\ldotsm)↓2$, and proceed as follows.
\yyskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf B1. }\!
[Initialize.]\xskip Set $m
← t$, $N ← 0$, $j ← 1$. (During this algorithm, $m$ represents the
number of simulated uniform deviates whose relation to $p$ is
still unknown, since they match $p$ in their leading $j - 1$
bits; and $N$ is the number of simulated deviates known to be
less than $p$.)
\yskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf B2. }\!
[Look at next column of bits.]\xskip
Generate a random integer $M$ with the binomial distribution
$(m, {1\over 2})$. (Now $M$ represents the number of unknown
deviates which fail to match $b↓j$.) Set $m ← m - M$, and if
$b↓j = 1$ set $N ← N + M$.
\yskip\hangindent 38pt\noindent\hjust to 38pt{\hfill\bf B3. }\!
[Done?]\xskip If $m = 0$ or if the remaining bits $(.b↓{j+1}b↓{j+2}
\ldotsm)↓2$ of $p$ are all zero, the algorithm terminates. Otherwise, set $j ← j
+ 1$ and return to step B2.\quad\blackslug
\yyskip\noindent [When $b↓j = 1$ for infinitely
many $j$, the average number of iterations $A↓t$ satisfies
$$A↓0 = 0;\qquad A↓n = 1 + {1\over 2↑n} \sum ↓{k}{n\choose k}
A↓k,\quad\hjust{for }n ≥ 1.$$
\vskip-4pt\noindent
Letting $A(z) = \sum A↓nz↑n/n!$, we have $A(z)
= e↑z - 1 + A({1\over 2}z)e↑{z/2}$. Therefore $A(z)e↑{-z} = 1 -
e↑{-z} + A({1\over 2}z)e↑{-z/2} = \sum ↓{k≥0}(1 - e↑{-z/2↑k}
) = 1 - e↑{-z} + \sum ↓{n≥1} z↑n(-1)↑{n+1}/\biglp n! (2↑n - 1)\bigrp$,
and
$$A↓n = 1 + \sum ↓{k≥1} {n\choose k} {(-1)↑{k+1}\over
2↑k - 1} = 1 + {V↓{n+1}\over n + 1} = \lg n + {\gamma \over
\ln 2} + {1\over 2} + f↓0(n) + O(n↑{-1})$$
in the notation of exercise 5.2.2--48.]
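Algorithm B1–B3 above can be simulated directly for a p with finite binary expansion; in this sketch (my illustration) the Binomial(m, 1/2) deviate is generated naively by m coin flips:

```python
import random

def count_below(t, p_bits, rng):
    # steps B1-B3: how many of t simulated uniform deviates are < p,
    # where p = (.p_bits)_2 has a finite binary expansion
    m, N = t, 0
    for j, b in enumerate(p_bits):
        if m == 0 or '1' not in p_bits[j:]:
            break                     # step B3's termination test
        M = sum(rng.random() < 0.5 for _ in range(m))   # Binomial(m, 1/2)
        m -= M                        # M deviates fail to match bit b
        if b == '1':
            N += M                    # a 0 bit against b = 1 means deviate < p
    return N
```

Each simulated deviate is below p with probability exactly p, so the count has the binomial distribution (t, p); the test checks the mean tp statistically.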
\baselineskip 13pt %the next exercise wants loose spacing
\ansno 28. Generate a random point $(y↓1, \ldotss , y↓n)$ on
the unit sphere, and let $\rho = \sqrt{\sum
a↓ky↓k↑2}$. Generate an independent uniform deviate $U$,
and if $\rho ↑{n+1}U < K\sqrt{\sum a↓k↑2y↓k↑2}$, output the point $(y↓1/\rho , \ldotss , y↓n/\rho )$; otherwise
start over. Here $K↑2 = \min\leftset(\sum a↓ky↓k↑2)↑{n+1}/(\sum
a↓k↑2y↓k↑2)\relv \sum y↓k↑2 = 1\rightset = a↑{n-1}↓{n}$
if $na↓n ≥ a↓1$, $\biglp(n + 1)/(a↓1 + a↓n)\bigrp↑{n+1}(a↓1a↓n/n)↑n$ otherwise.
\baselineskip 11pt % resume normal interline spacing
\ansbegin{3.4.2}
\ansno 1. There are $N - t\choose n-m$
ways to pick $n - m$ records from the last $N - t$; $N-t-1\choose n-m-1$
ways to pick $n - m - 1$ from $N -
t - 1$ after selecting the $(t + 1)$st item.
\ansno 2. Step S3 will never go to step S5 when
the number of records left to be examined is equal to $n - m$.
\ansno 3. We should not confuse ``conditional''
and ``unconditional'' probabilities. The quantity $m$ depends
randomly on the selections which took place among the first
$t$ elements; if we take the average over all possible choices
which could have occurred among these elements, we will find that
$(n - m)/(N - t)$ is exactly $n/N$ on the average. For example,
consider the second element; if the first element was selected
in the sample (this happens with probability $n/N$), the second
element is selected with probability $(n - 1)/(N - 1)$; if the
first element was not selected, the second is selected with
probability $n/(N - 1)$. The overall probability of selecting
the second element is $(n/N)\biglp(n - 1)/(N - 1)\bigrp + (1 - n/N)\biglp n/(N
- 1)\bigrp = n/N$.
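The arithmetic for the second element generalizes to any $n ≤ N$; the following check (an addition of this transcription, not part of the answer) verifies it exactly with rational arithmetic.

```python
from fractions import Fraction

def second_element_selection_prob(n, N):
    """Overall probability that the second element is selected, combining
    the two conditional cases worked out in the answer."""
    p1 = Fraction(n, N)                       # first element selected
    return p1 * Fraction(n - 1, N - 1) + (1 - p1) * Fraction(n, N - 1)
```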
\ansno 4. From the algorithm,
$$p(m, t + 1) = \left(1 - {n - m\over N - t}\right)p(m, t) + {n -
(m - 1)\over N - t} p(m - 1, t).$$
The desired formula can be proved by induction
on $t$. In particular, $p(n, N) = 1$.
\ansno 5. In the notation of exercise 4, the probability
that $t = k$ at termination is $q↓k = p(n, k) - p(n, k - 1)
= {k-1\choose n-1}/{N\choose n}$. The average is $\sum↓{0≤k≤N}
kq↓k = (N + 1)n/(n + 1)$.
\ansno 6. Similarly, $\sum↓{0≤k≤N}k(k
+ 1)q↓k = (N + 2)(N + 1)n/(n + 2)$; the variance is therefore
$(N + 1)(N - n)n/(n + 2)(n + 1)↑2$.
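The moments in exercises 5 and 6 can be verified mechanically from $q↓k = {k-1\choose n-1}/{N\choose n}$; this check is an addition of this transcription, not part of the answers.

```python
from fractions import Fraction
from math import comb

def moments_of_t(n, N):
    """Mean sum k*q_k and second sum k*(k+1)*q_k of the stopping value t,
    using q_k = C(k-1, n-1)/C(N, n) from the answer to exercise 5."""
    q = {k: Fraction(comb(k - 1, n - 1), comb(N, n)) for k in range(1, N + 1)}
    mean = sum(k * qk for k, qk in q.items())
    second = sum(k * (k + 1) * qk for k, qk in q.items())
    return mean, second
```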
\ansno 7. Suppose the choice is $1 ≤ x↓1 < x↓2
< \cdots < x↓n ≤ N$. Let $x↓0 = 0$, $x↓{n+1} = N + 1$. The choice
is obtained with probability $p = \prod ↓{1≤t≤N}p↓t$,
where
$$\vjust{\halign{\lft{$\dispstyle p↓t={#}$}\qquad⊗\hjust{for }$#$\hfill\cr
N-(t-1)-n+m\over N-(t-1)⊗x↓m<t<x↓{m+1};\cr
n-m\over N-(t-1)⊗t=x↓{m+1}.\cr}}$$
The denominator of the product $p$ is $N!$; the numerator
contains the terms $N - n$, $N - n - 1$, $\ldotss$, 1 for those
$t$'s which are not $x$'s, and the terms $n$, $n - 1$, $\ldotss$,
1 for those $t$'s which {\sl are} $x$'s. Hence $p = (N - n)!n!/N!$.\xskip
{\sl Example:} $n = 3$, $N = 8$, $(x↓1, x↓2, x↓3) = (2, 3, 7)$; $p
= {5\over8} {3\over 7} {2\over 6} {4\over 5} {3\over 4} {2\over
3} {1\over 2} {1\over 1}$.
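The product of the factors $p↓t$ can be computed directly and compared with $(N-n)!\,n!/N!$; this numerical check is an addition of this transcription.

```python
from fractions import Fraction
from math import factorial

def selection_probability(x, N):
    """Probability that the sampling method selects exactly the sample x
    (a sorted tuple of positions), multiplying the factors p_t of the
    answer; the loop may stop once all n items are chosen, since the
    remaining factors equal 1."""
    n = len(x)
    chosen = set(x)
    p = Fraction(1)
    m = 0                                     # items selected so far
    for t in range(1, N + 1):
        if t in chosen:
            p *= Fraction(n - m, N - t + 1)   # t = x_{m+1}: select it
            m += 1
        else:
            p *= Fraction(N - t + 1 - (n - m), N - t + 1)
        if m == n:
            break
    return p
```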
\ansno 9. The reservoir gets seven records: 1,
2, 3, 5, 9, 13, 16. The final sample consists of records 2, 5, 16.
\ansno 10. Delete step R6 and the variable $m$. Replace the $I$ table by
a table of records, initialized to the first $n$ records in step R1,
and with the new record replacing the $M$th table entry in step R4.
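The modified algorithm described here can be sketched in Python; the representation of the stream and the seeded generator are assumptions of this sketch, while the replacement rule follows the text.

```python
import random

def reservoir_sample(stream, n, rng=None):
    """Algorithm R as modified in the answer: keep a table of n records
    (no separate I table), and let each new record replace the Mth table
    entry when the uniform choice M in 1..t lands in 1..n."""
    rng = rng or random.Random(42)
    table = []
    for t, record in enumerate(stream, start=1):
        if t <= n:
            table.append(record)              # step R1: the first n records
        else:
            M = rng.randint(1, t)             # uniform in 1..t
            if M <= n:
                table[M - 1] = record         # new record replaces entry M
    return table
```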
\ansno 11. Arguing as in Section 1.2.10, which considers the
special case $n = 1$, we see that the generating function is
$$G(z) = z↑n\left({1\over n + 1} + {n\over n + 1} z\right)\left({2\over
n + 2} + {n\over n + 2} z\right)\ldotsm\left({N - n\over N} + {n\over
N} z\right).$$
The mean is $n + \sum ↓{n<t≤N}(n/t) = n(1
+ H↓N - H↓n)$; and the variance is $n(H↓N - H↓n) - n↑2(H↑{(2)}↓{N}
- H↑{(2)}↓{n})$.
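The stated mean and variance can be checked against the generating function itself by expanding the product with rational coefficients; this check is an addition of this transcription.

```python
from fractions import Fraction

def harmonic(n, r=1):
    """Generalized harmonic number H_n^(r)."""
    return sum(Fraction(1, k ** r) for k in range(1, n + 1))

def insertion_count_moments(n, N):
    """Mean and variance of the number of table insertions, computed from
    G(z) = z^n * prod_{n<t<=N} ((t-n)/t + (n/t) z)."""
    coeffs = {n: Fraction(1)}                 # the factor z^n
    for t in range(n + 1, N + 1):
        new = {}
        for deg, c in coeffs.items():
            new[deg] = new.get(deg, 0) + c * Fraction(t - n, t)
            new[deg + 1] = new.get(deg + 1, 0) + c * Fraction(n, t)
        coeffs = new
    mean = sum(d * c for d, c in coeffs.items())
    second = sum(d * d * c for d, c in coeffs.items())
    return mean, second - mean * mean
```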
\ansno 12. (Note that $π↑{-1} = (b↓tt) \ldotsm (b↓33)(b↓22)$,
so we seek an algorithm which goes from the representation of
$π$ to that for $π↑{-1}$.) Set $b↓j ← j$ for $1 ≤ j ≤ t$. Then
for $j = 2$, 3, $\ldotss$, $t$ (in this order) interchange $b↓j↔
b↓{a↓j}$. Finally for $j = t$, $\ldotss$, 3, 2 (in this order)
set $b↓{a↓j} ← b↓j$. $\biglp$The algorithm is based on the fact that
$(a↓tt)π↓1 = π↓1(b↓tt)$.$\bigrp$
\ansno 13. Renumbering the deck 0, 1,
$\ldotss$, $2n - 2$, we find that $s$ takes $x$ into $(2x)\mod(2n -
1)$, $c$ takes $x$ into $(x + 1)\mod(2n - 1)$. We have ($c$ followed
by $s) = cs = sc↑2$. Therefore any product of $c$'s and $s$'s
can be transformed into the form $s↑ic↑k$. Also $2↑{\varphi(2n-1)}
≡ 1$ $\biglp$ modulo $(2n - 1)\bigrp$; since $s↑{\varphi(2n-1)}$
and $c↑{2n-1}$ are the identity permutation, at most $(2n -
1)\varphi (2n - 1)$ arrangements are possible. (The {\sl exact}
number of different arrangements is $(2n - 1)k$, where $k$ is
the order of 2 modulo $(2n - 1)$. For if $s↑k = c↑j$, then $c↑j$
fixes the card 0, so $s↑k = c↑j =\null$identity.) For further details,
see {\sl SIAM Review \bf 3} (1961), 293--297.
\ansno 14. Set $Y↓j ← j$ for $t - n < j ≤ t$. Then for $j =
t$, $t - 1$, $\ldotss$, $t - n + 1$ do the following operations: Set
$k ← \lfloor jU\rfloor + 1$. If $k > t - n$ then set $X↓j ←
Y↓k$ and $Y↓k ← Y↓j$; otherwise if $k = X↓i$ for some $i > j$
(a symbol table algorithm could be used), then set $X↓j ← Y↓i$
and $Y↓i ← Y↓j$; otherwise set $X↓j ← k$. (The idea is to let
$Y↓{t-n+1}$, $\ldotss$, $Y↓j$ represent $X↓{t-n+1}$, $\ldotss$, $X↓j$,
and if $i > j$ and $X↓i ≤ t - n$ also to let $Y↓i$ represent
$X↓{X↓i}$, in the execution of Algorithm P.)
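The bookkeeping can be sketched in Python, with a dictionary standing in for the symbol table; that data structure, and the check against a full Algorithm P shuffle, are assumptions of this sketch.

```python
import random

def sample_in_place(n, t, rng=None):
    """The answer's method: produce X_{t-n+1},...,X_t, the last n entries
    of a random shuffle of 1..t, using only O(n) storage.  Y is kept only
    for t-n < j <= t; the dict 'where' is the symbol table locating the
    i > j with X_i = k when k <= t-n."""
    rng = rng or random.Random(2024)
    Y = {j: j for j in range(t - n + 1, t + 1)}
    X = {}
    where = {}                                # value k <= t-n  ->  index i
    for j in range(t, t - n, -1):
        k = rng.randint(1, j)                 # k = floor(jU) + 1
        if k > t - n:
            X[j] = Y[k]
            Y[k] = Y[j]
        elif k in where:                      # k = X_i for some i > j
            i = where[k]
            X[j] = Y[i]
            Y[i] = Y[j]
        else:
            X[j] = k
            where[k] = j                      # Y_j now represents X_{X_j}
    return [X[j] for j in range(t - n + 1, t + 1)]
```

Replaying the same random choices through the ordinary Fisher--Yates shuffle reproduces the same sample, which is the content of the parenthetical remark.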
%folio 715 galley 7 (C) Addison-Wesley 1978 *
\ansbegin{3.5}
\ansno 1. A $b$-ary sequence, yes (cf.\
exercise 2); a $[\,0,1)$ sequence, no (since only finitely many
values are assumed by the elements).
\ansno 2. It is 1-distributed and 2-distributed,
but not 3-distributed (the binary number 111 never appears).
\ansno 3. Cf.\ exercise 3.2.2--17; repeat the sequence
there with a period of length 27.
\ansno 4. The sequence begins ${1\over 3}$, ${2\over
3}$, ${2\over 3}$, ${1\over 3}$, ${1\over 3}$, ${1\over 3}$,
${1\over 3}$, ${2\over 3}$, ${2\over 3}$, ${2\over 3}$, ${2\over
3}$, ${2\over 3}$, ${2\over 3}$, ${2\over 3}$, ${2\over 3}$,
etc. When $n = 1$, 3, 7, 15, $\ldots$ we have $\nu (n) = 1$, 1,
5, 5, $\ldots$ so that $\nu (2↑{2k-1} - 1) = \nu (2↑{2k} - 1)
= (2↑{2k} - 1)/3$; hence $\nu (n)/n$ oscillates between ${1\over
3}$ and approximately ${2\over 3}$, and no limit exists.
The probability is undefined.
$\biglp$The methods of Section 4.2.4 show, however, that a numerical
value {\sl can} meaningfully be assigned to $$\Pr(U↓n < {1\over
2}) = \Pr (\hjust{leading digit of the radix-4 representation of $n +
1$ is 1}),$$ namely $\log↓4 2 = {1\over 2}$.$\bigrp$
\ansno 5. If $\nu ↓1(n)$, $\nu ↓2(n)$, $\nu ↓3(n)$,
$\nu ↓4(n)$ are the counts corresponding to the four probabilities,
we have $\nu ↓1(n) + \nu ↓2(n) = \nu ↓3(n) + \nu ↓4(n)$ for
all $n$. So the desired result follows by addition of limits.
\def\Pro{\mathop{\overline{\char'120\char'162}}}
\def\Pru{\mathop{\underline{\char'120\char'162}}}
\ansno 6. By exercise 5 and induction,
$$\chop{9pt}{\Pr\biglp S↓j(n)\hjust{ for some }j,\; 1 ≤ j ≤ k\bigrp
= \sum ↓{1≤j≤k} \Pr\biglp S↓j(n)\bigrp .}$$
As $k → ∞$, the latter is a monotone sequence bounded
by 1, so it converges; and
$$\chop{9pt}{\Pru\biglp S↓j(n)\hjust{ for some }j≥1\bigrp≥\sum↓{1≤j≤k}\Pr\biglp
S↓j(n)\bigrp}$$
for all $k$. For a counterexample to equality,
it is not hard to arrange things so that $S↓j(n)$ is always
true for {\sl some} $j$, yet $\Pr\biglp S↓j(n)\bigrp = 0$ for {\sl
all} $j$.
\ansno 7. Let $p↓i = \sum ↓{j≥1}\Pr\biglp S↓{ij}(n)\bigrp
$. The result of the preceding exercise can be generalized to
$\Pru\biglp S↓j(n)\hjust{ for some }j ≥ 1\bigrp ≥ \sum ↓{j≥1}\Pru\biglp
S↓j(n)\bigrp $, for {\sl any} disjoint statements $S↓j(n)$.
So we have $1 = \Pr\biglp S↓{ij}(n)\hjust{ for some }i, j ≥ 1\bigrp
≥ \sum ↓{i≥1}\Pru\biglp S↓{ij}(n)\hjust{ for some }j ≥ 1\bigrp
≥ \sum ↓{i≥1} p↓i = 1$, and hence $\Pru\biglp S↓{ij}(n)\hjust{ for
some }j ≥ 1\bigrp = p↓i$. Given $\epsilon > 0$, let $I$ be large
enough so that $\sum↓{1≤i≤I}p↓i ≥ 1 - \epsilon $.
Let $$\phi ↓i(N) = \biglp\hjust{number of $n<N$ with $S↓{ij}(n)$ true for
some $j ≥ 1$}\bigrp /N.$$ Clearly $\sum ↓{1≤i≤I}
\phi ↓i(N) ≤ 1$, and for all large enough $N$ we have $\sum ↓{2≤i≤I}
\phi ↓i(N) ≥ \sum ↓{2≤i≤I}p↓i - \epsilon $; hence $\phi
↓1(N) ≤ 1 - \phi ↓2(N) - \cdots - \phi ↓I(N) ≤ 1 - p↓2 - \cdots
- p↓I + \epsilon ≤ 1 - (1 - \epsilon - p↓1) + \epsilon = p↓1
+ 2\epsilon $. This proves that $\Pro\biglp S↓{1j}(n)\hjust{ for some }j ≥ 1\bigrp
≤ p↓1 + 2\epsilon $; hence $\Pr\biglp S↓{1j}(n)\hjust{ for some }j ≥ 1\bigrp
= p↓1$, and the desired result holds for $i = 1$. By symmetry
of the hypotheses, it holds for any value of $i$.
\ansno 8. Add together the probabilities
for $j$, $j + d$, $j + 2d$, $\ldots$ in Definition E.
\ansno 9. $\limsup↓{n→∞}(a↓n+b↓n)≤\limsup↓{n→∞}a↓n+\limsup↓{n→∞}b↓n$; hence we get
$$\chop{9pt}{\limsup↓{n→∞} \biglp (y↓{1n} - α)↑2 + \cdots + (y↓{mn}
- α)↑2\bigrp ≤ mα↑2 - 2mα↑2 + mα↑2 = 0,}$$
and this can happen only if each $(y↓{jn} - α)$ tends to zero.
\ansno 10. In the evaluation of the sum in Eq$.$ (22).
\ansno 11. $\langle U↓{2n}\rangle$ is $k$-distributed if $\langle
U↓n\rangle$ is $(2, 2k)$-distributed.
\ansno 12. Let $f(x↓1, \ldotss , x↓k) = 1$ if $u ≤ \max(x↓1,
\ldotss , x↓k) < v$; $f(x↓1, \ldotss , x↓k) = 0$ otherwise. Then
apply Theorem B.
\ansno 13. Let
$$\eqalign{p↓k⊗ = \Pr(U↓n \hjust{ begins a gap of length }k-1)\cr
⊗=\Pr\biglp U↓{n-1} \in [α, β),\ U↓n \notin [α, β),\ \ldotss
,\ U↓{n+k-2} \notin [α, β),\ U↓{n+k-1} \in [α,β)\bigrp\cr
⊗=p↑2(1-p)↑{k-1}.\cr}$$
%folio 717 galley 8 (C) Addison-Wesley 1978 *
It remains to translate this into the probability that $f(n)
- f(n - 1) = k$. Let $\nu ↓k(n) = \biglp \hjust{number of $j ≤ n$ with $f(j)
- f(j - 1) = k$}\bigrp $; let $\mu ↓k(n) = ($number of $j ≤ n$
with $U↓j$ the beginning of a gap of length $k - 1)$; and let
$\mu (n)$ similarly count the number of $1 ≤ j ≤ n$ with $U↓j
\in [α, β)$. We have $\mu↓k\biglp f(n)\bigrp=\nu ↓k(n)$,
$\mu\biglp f(n)\bigrp=n$. As $n → ∞$, we must have $f(n)→∞$, hence
$$\nu ↓k(n)/n = \biglp \mu ↓k(f(n))/f(n)\bigrp \cdot \biglp
f(n)/\mu (f(n))\bigrp → p↓k/p = p(1 - p)↑{k-1}.$$
[We have only made use of the fact that the sequence is
$(k + 1)$-distributed.]
\ansno 14. Let
$$\eqalign{p↓k⊗ = \Pr(U↓n\hjust{ begins a run of length }k)\cr\noalign{\vskip3pt}
⊗=\Pr(U↓{n-1} > U↓n < \cdots < U↓{n+k-1} > U↓{n+k})\cr\noalign{\vskip3pt}
⊗= {1\over (k + 2)!}\biggglp{k+2\choose1}{k+1\choose1}-{k+2\choose1}
-{k+2\choose1}+1\bigggrp\cr\noalign{\vskip3pt}
⊗= {k\over (k + 1)!} - {k + 1\over (k + 2)!}\cr}$$
(cf.\ exercise 3.3.2--13). Now proceed as in the previous
exercise to transfer this to $\Pr\biglp f(n) - f(n - 1) = k\bigrp$.
[We have assumed only that the sequence is $(k + 2)$-distributed.]
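The inclusion-exclusion expression and the closed form $k/(k+1)! - (k+1)/(k+2)!$ agree identically, and the probabilities $p↓k$ sum to $\Pr(U↓{n-1} > U↓n) = {1\over2}$; this exact check is an addition of this transcription.

```python
from fractions import Fraction
from math import comb, factorial

def run_length_prob(k):
    """p_k = Pr(U_n begins an ascending run of length k), via the
    inclusion-exclusion count displayed in the answer."""
    return Fraction(comb(k + 2, 1) * comb(k + 1, 1)
                    - 2 * comb(k + 2, 1) + 1, factorial(k + 2))
```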
\ansno 15. For $s, t ≥ 0$ let
$$\eqalign{p↓{st}⊗ = \Pr(X↓{n-2t-3} = X↓{n-2t-2} ≠ X↓{n-2t-1} ≠
\cdots ≠ X↓{n-1}\cr⊗\hskip 45mm\hjust{and }X↓n= \cdots = X↓{n+s} ≠ X↓{n+s+1})\cr
\noalign{\vskip3pt} ⊗=\textstyle ({1\over 2})↑{s+2t+3};\cr}$$
for $t ≥ 0$ let $q↓t = \Pr(X↓{n-2t-2} = X↓{n-2t-1}
≠ \cdots ≠ X↓{n-1}) = ({1\over 2})↑{2t+1}$. By exercise 7, $\Pr(X↓n$
is not the beginning of a coupon set$) = \sum ↓{t≥0} q↓t = {2\over
3}$; $\Pr(X↓n$ is the beginning of a coupon set of length $s + 2) =
\sum ↓{t≥0} p↓{st} = {1\over 3}({1\over 2})↑{s+1}$. Now proceed
as in exercise 13.
\ansno 16. (Solution by R. P. Stanley.)\xskip Whenever the subsequence
$S = (b - 1)$, $(b - 2)$, $\ldotss$, 1, 0, 0, 1, $\ldotss$, $(b - 2)$,
$(b - 1)$ appears, a coupon set must end at the right of $S$,
since some coupon set is completed in the first half of $S$.
We now proceed to calculate the probability that a coupon set
begins at position $n$ by manipulating the probabilities that
the last prior appearance of $S$ ends at position $n - 1$, $n
- 2$, etc., as in exercise 15.
\ansno 18. Proceed as in the proof of Theorem A to calculate
$\underline{\hjust{Pr}}$ and $\overline{\hjust{Pr}}$.
\ansno 19. (Solution by T. Herzog.)\xskip Yes; e.g., the sequence
$\langle U↓{\lfloor n/2\rfloor}\rangle$ when $\langle U↓n\rangle$
satisfies R4 (or even its weaker version), cf.\ exercise 33.
\ansno 21. $\Pr(Z↓n \in M↓1, \ldotss , Z↓{n+k-1} \in M↓k) = p(M↓1)
\ldotsm p(M↓k)$, for all $M↓1$, $\ldotss$, $M↓k \in \Mscr$.
\ansno 22. If the sequence is $k$-distributed, the limit is
zero by integration and Theorem B. Conversely, note that if
$f(x↓1, \ldotss , x↓k)$ has an absolutely convergent Fourier
series
$$f(x↓1, \ldotss , x↓k) = \sum ↓{-∞<c↓1, \ldotss , c↓k<∞} a(c↓1,
\ldotss , c↓k)\exp\biglp2πi(c↓1x↓1 + \cdots + c↓kx↓k)\bigrp ,$$
we have $\lim↓{N→∞} {1\over N} \sum ↓{0≤n<N} f(U↓n,
\ldotss , U↓{n+k-1}) = a(0, \ldotss , 0) + \epsilon ↓r$, where
$$|\epsilon ↓r| ≤ \sum ↓{|c↓1|, \ldotss , |c↓k|>r} |a(c↓1, \ldotss, c↓k)|,$$
so $\epsilon ↓r$ can be made arbitrarily small.
Hence this limit is equal to $$a(0, \ldotss , 0) = \int ↑{1}↓{0}
\cdots \int ↑{1}↓{0} f(x↓1, \ldotss , x↓k)\, dx↓1 \ldotsm dx↓k,$$ and
Eq$.$ (8) holds for all sufficiently smooth functions $f$.
The remainder of the proof shows that the function in (9) can
be approximated by smooth functions to any desired accuracy.
\ansno 23. See {\sl AMM \bf 75} (1968), 260--264.
\ansno 24. This follows immediately from exercise 22.
\ansno 25. If the sequence is equidistributed, the denominator
in Corollary S approaches ${1\over 12}$, and the numerator approaches
the quantity in this exercise.
\ansno 26. See {\sl Math$.$ Comp$.$ \bf 17} (1963), 50--54. [Consider
also the following example by A. G. Waterman: Let $\langle U↓n\rangle$
be an equidistributed $[\,0, 1)$ sequence and $\langle X↓n\rangle$
an $∞$-distributed binary sequence. Let $V↓n = U↓{\lceil\sqrt n\,\rceil}$
or $1 - U↓{\lceil\sqrt n\,\rceil}$ according as $X↓n$ is 0 or 1. Then
$\langle V↓n\rangle$ is equidistributed and white, but $\Pr(V↓n
= V↓{n+1}) = {1\over 2}$. Let $W↓n = (V↓n-ε↓n)\mod 1$
where $\langle \epsilon ↓n\rangle$ is any sequence that decreases
monotonically to 0; then $\langle W↓n\rangle$ is equidistributed
and white, yet $\Pr(W↓n < W↓{n+1}) = {3\over 4}$.]
\ansno 28. Let $\langle U↓n\rangle$ be $∞$-distributed, and
consider the sequence $\langle {1\over 2}(X↓n + U↓n)\rangle $.
This is 3-distributed, using the fact that $\langle U↓n\rangle$
is $(16, 3)$-distributed.
\ansno 29. If $x = x↓1x↓2 \ldotsm x↓t$ is any binary number,
we can consider the number $\nu ↑{E}↓{x}(n)$ of times $X↓p \ldotsm
X↓{p+t-1} = x$, where $1 ≤ p ≤ n$ and $p$ is even. Similarly,
let $\nu ↑{O}↓{x}(n)$ count the number of times when $p$ is
odd. Let $\nu ↑{E}↓{x}(n) + \nu ↑{O}↓{x}(n)=\nu↓x(n)$. Now
\def\\{\char'403 }
$$\nu ↑{E}↓{0}(n) = \sum \nu ↑{E}↓{0\\\\\ldots\\}(n) \approx
\sum \nu ↑{O}↓{\\0\\\ldots\\}(n)
\approx \sum \nu ↑{E}↓{\\\\0\ldots\\}
(n) \approx \cdots \approx \sum \nu ↑{O}↓{\\\\\\\ldots0}(n)$$
where the $\nu $'s in these summations have $2k$
subscripts, $2k-1$ of which are asterisks (meaning that they are being
summed over---each sum is taken over $2↑{2k-1}$ combinations of zeros and ones),
and where ``$\approx$'' denotes
approximate equality (except for an error of at most $2k$ due
to end conditions). Therefore we find that
$$\twoline{{1\over n} 2k\nu ↑{E}↓{0}(n) = {1\over n} \left(\sum
\nu ↓{\\0\\\ldots\\}(n)+\cdots+\sum\nu↓{\\\\\\\ldots0}(n)\right)}{3pt}{
\null+{1\over n}
\chop{9pt}{\sum↓x\biglp r(x)-s(x)\bigrp\nu↓x↑E(n)}+O\left(1\over n\right),}$$
where $x = x↓1 \ldotsm x↓{2k}$ contains $r(x)$ zeros
in odd positions and $s(x)$ zeros in even positions. By $(2k)$-distribution,
the parenthesized quantity tends to $k(2↑{2k-1})/2↑{2k} = k/2$.
The remaining sum is clearly a maximum if $\nu ↑{E}↓{x}(n) =
\nu ↓x(n)$ when $r(x) > s(x)$, and $\nu ↑{E}↓{x}(n) = 0$ when
$r(x) < s(x)$. So the maximum of the right-hand side becomes
$${k\over 2} + \sum ↓{0≤s<r≤k} (r - s)\left.{k\choose r}{k\choose s}\right/
2↑{2k} = {k\over 2} + k\left.{2k-1\choose k}\right/2↑{2k}.$$
%folio 720 galley 9 (C) Addison-Wesley 1978 *
Now $\overline{\hjust{Pr}}(X↓{2n} = 0) ≤ \limsup↓{n→∞}\nu ↑{E}↓{0}(2n)/n$,
so the proof is complete.\xskip {\sl Note:}
$$\eqalign{\noalign{\vskip 4pt}
\sum ↓{r,s}{n\choose r}{n\choose s}\max(r,s)⊗=
2n2↑{2n-2} + n{2n-1\choose n};\cr
\sum ↓{r,s}{n\choose r}{n\choose s}\min(r,s)⊗=
2n2↑{2n-2} - n{2n-1\choose n}.\cr}$$
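Both identities in the note can be verified by direct summation for small $n$; this check is an addition of this transcription.

```python
from math import comb

def max_min_sums(n):
    """The two sums in the note, evaluated directly over 0 <= r, s <= n."""
    smax = sum(comb(n, r) * comb(n, s) * max(r, s)
               for r in range(n + 1) for s in range(n + 1))
    smin = sum(comb(n, r) * comb(n, s) * min(r, s)
               for r in range(n + 1) for s in range(n + 1))
    return smax, smin
```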
\ansno 30. Let $f(x↓1, x↓2, \ldotss , x↓{2k}) = \hjust{sign}
(x↓1 - x↓2 + x↓3 - x↓4 + \cdots - x↓{2k})$. Construct a
directed graph with $2↑{2k}$ nodes labeled $(E; x↓1, \ldotss
, x↓{2k-1})$ and $(O; x↓1, \ldotss , x↓{2k-1})$, where each $x$
is either 0 or 1. Let there be $1 + f(x↓1, x↓2, \ldotss , x↓{2k})$
directed arcs from $(E; x↓1, \ldotss , x↓{2k-1})$ to $(O; x↓2,
\ldotss , x↓{2k})$, and $1 - f(x↓1, x↓2, \ldotss , x↓{2k})$ directed
arcs leading from $(O; x↓1, \ldotss , x↓{2k-1})$ to $(E; x↓2,
\ldotss , x↓{2k})$. We find that each node has the same number of
arcs leading into it as those leading out; for example, $(E;
x↓1, \ldotss , x↓{2k-1})$ has $1 - f(0, x↓1, \ldotss , x↓{2k-1})
+ 1 - f(1, x↓1, \ldotss , x↓{2k-1})$ leading in and $1 + f(x↓1,
\ldotss , x↓{2k-1}, 0) + 1 + f(x↓1, \ldotss , x↓{2k-1}, 1)$ arcs
leading out, and $f(x, x↓1, \ldotss , x↓{2k-1}) = -f(x↓1, \ldotss
, x↓{2k-1}, x)$. Drop all nodes which have no paths leading
either in or out, i.e., $(E; x↓1, \ldotss , x↓{2k-1})$ if $f(0,
x↓1, \ldotss , x↓{2k-1}) = +1$, or $(O; x↓1, \ldotss , x↓{2k-1})$
if $f(1, x↓1, \ldotss , x↓{2k-1}) = -1$. The resulting directed
graph is seen to be connected, since we can get from any node
to $(E; 1, 0, 1, 0, \ldotss , 1)$ and from this to any desired
node. By Theorem 2.3.4.2G, there is a cyclic path traversing
each arc; this path has length $2↑{2k+1}$, and we may assume
that it starts at node $(E; 0, \ldotss , 0)$. Construct a cyclic
sequence with $X↓1 = \cdots = X↓{2k-1} = 0$, and $X↓{n+2k-1}
= x↓{2k}$ if the $n$th arc of the path is from $(E; x↓1, \ldotss
, x↓{2k-1})$ to $(O; x↓2, \ldotss , x↓{2k})$ or from $(O; x↓1,
\ldotss , x↓{2k-1})$ to $(E; x↓2, \ldotss , x↓{2k})$. For example,
the graph for $k = 2$ is shown in Fig.\ A--5; the arcs of the
cyclic path are numbered from 1 to 32, and the cyclic sequence
is (00001000110010101001101110111110)(00001$\ldotsm$). Note that
$\Pr(X↓{2n} = 0) = {11\over 16}$ in this sequence. The sequence
is clearly $(2k)$-distributed, since each $(2k)$-tuple $x↓1x↓2
\ldotsm x↓{2k}$ occurs $1 + f(x↓1, \ldotss , x↓{2k}) + 1 - f(x↓1,
\ldotss , x↓{2k}) =2$ times in the cycle. The fact that $\Pr(X↓{2n}
= 0)$ has the desired value comes from the fact that the maximum
value on the right-hand side in the proof of the preceding exercise
has been achieved by this construction.
\topinsert{\vskip 63mm
\ctrline{\caption Fig. A--5. Directed graph for the
construction in exercise 30.}}
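The two claims about the explicit cyclic sequence for $k = 2$ can be checked directly; the alignment $X↓1 = $ first bit (so that the $X↓{2n}$ are the bits in even positions) is the one assumption of this check, which is an addition of this transcription.

```python
def check_cycle():
    """Verify, for the 32-bit cyclic sequence of exercise 30 with k = 2,
    that every 4-tuple occurs exactly twice per cycle (so the sequence is
    4-distributed) and that 11 of the 16 elements X_{2n} are 0."""
    s = "00001000110010101001101110111110"
    assert len(s) == 32
    doubled = s + s                           # for cyclic windows
    counts = {}
    for i in range(32):
        w = doubled[i:i + 4]
        counts[w] = counts.get(w, 0) + 1
    # X_1 X_2 ... X_32 = s, so X_2, X_4, ... are s[1], s[3], ...
    zeros_even = sum(1 for i in range(1, 32, 2) if s[i] == "0")
    return counts, zeros_even
```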
\ansno 31. Use Algorithm W with rule $\Rscr↓1$ selecting the entire sequence.
[For a generalization of this
type of nonrandom behavior in R5-sequences, see Jean Ville,
{\sl \'Etude Critique de la notion de Collectif} (Paris,
1939), 55--62. Perhaps R6 is also too weak, from this standpoint.]
\ansno 32. If $\Rscr,\Rscr↑\prime$ are computable subsequence rules,
so is $\Rscr↑{\prime\prime} = \Rscr\Rscr↑\prime$ defined by the following
functions: $f↓n↑{\prime\prime}(x↓0, \ldotss , x↓{n-1}) = 1$ iff
$\Rscr$ defines the subsequence $x↓{r↓1}$, $\ldotss
$, $x↓{r↓k}$ of $x↓0$, $\ldotss$, $x↓{n-1}$, where $k ≥ 0$ and
$0 ≤ r↓1 < \cdots < r↓k < n$ and $f↓k↑\prime (x↓{r↓1},
\ldotss, x↓{r↓k}) = 1$.
Now $\langle X↓n\rangle\Rscr\Rscr↑\prime$ is $\biglp \langle
X↓n\rangle\Rscr\bigrp\Rscr↑\prime $. The result follows immediately.
\ansno 33. Given $\epsilon > 0$, find $N↓0$ such that $N > N↓0$
implies that both $|\nu ↓r(N)/N - p| < \epsilon$ and $|\nu ↓s(N)/N
- p| < \epsilon $. Then find $N↓1$ such that $N > N↓1$ implies
that $t↓N$ is $r↓M$ or $s↓M$ for some $M > N↓0$. Now $N > N↓1$
implies that
$$\left| {\nu ↓t(N)\over N} - p\right| = \left| {\nu ↓r(N↓r)
+ \nu ↓s(N↓s)\over N} - p\right| = \left| {\nu ↓r(N↓r) - pN↓r
+ \nu ↓s(N↓s) - pN↓s\over N↓r + N↓s}\right|<ε.$$
\ansno 34. For example, if the binary representation
of $t$ is $(1\ 0↑{b-2}\ 1\ 0↑{a↓1}\ 1\ 1\ 0↑{a↓2}\ 1\ \ldots\
1\ 0↑{a↓k})↓2$, where ``$0↑a$'' stands for a sequence of $a$
consecutive zeros, let the rule $\Rscr↓t$ accept $U↓n$ if and
only if $\lfloor bU↓{n-k}\rfloor = a↓1$, $\ldotss$, $\lfloor bU↓{n-1}\rfloor
= a↓k$.
\ansno 35. Let $a↓0 = s↓0$ and $a↓{m+1} = \max\leftset s↓k \relv 0 ≤ k
< 2↑{a↓m}\rightset$. Construct a subsequence rule that selects
element $X↓n$ if and only if $n = s↓k$ for some $k < 2↑{a↓m}$,
when $a↓m ≤ n < a↓{m+1}$. Then $\lim↓{m→∞}\nu (a↓m)/a↓m
= {1\over 2}$.
\ansno 36. Let $b$ and $k$ be arbitrary but fixed integers greater
than 1. Let $Y↓n = \lfloor bU↓n\rfloor $. An arbitrary infinite
subsequence $\langle Z↓n\rangle = \langle Y↓{s↓n}\rangle\Rscr$
determined by algorithms $\Sscr$ and $\Rscr$ (as in the proof of
Theorem M) corresponds in a straightforward but notationally
hopeless manner to algorithms $\Sscr↑\prime$ and $\Rscr↑\prime$ which
inspect $X↓t$, $X↓{t+1}$, $\ldotss$, $X↓{t+s}$ and/or select $X↓t$,
$X↓{t+1}$, $\ldotss$, $X↓{t+\min(k-1,s)}$ of $\langle X↓n\rangle$
if and only if $\Sscr$ and $\Rscr$ inspect and/or select $Y↓s$, where
$U↓s = (0.X↓tX↓{t+1} \ldotsm X↓{t+s})↓2$. Algorithms $\Sscr↑\prime$ and
$\Rscr↑\prime$ determine an infinite 1-distributed subsequence of
$\langle X↓n\rangle$ and in fact (as in exercise 32) this
subsequence is $∞$-distributed so it is $(k, 1)$-distributed.
Hence we find that $\underline{\hjust{Pr}}(Z↓n = a)$ and
$\overline{\hjust{Pr}}(Z↓n=a)$ differ from $1/b$ by less than $1/2↑k$.
[The result of this exercise is true if ``R6'' is replaced
consistently by ``R4'' or ``R5''; but it is false if ``R1'' is used, since
$X↓{n\choose2}$ might be identically zero.]
\ansno 37. For $n ≥ 2$ replace $U↓{n↑2}$ by ${1\over 2}(U↓{n↑2}
+ \delta ↓n)$, where $\delta ↓n = 0$ or 1 according as the set $\{U↓{(n-1)↑2+1},
\ldotss , U↓{n↑2-1}\}$ contains an even or odd
number of elements less than ${1\over 2}$. [{\sl Advances in
Math$.$ \bf 14} (1974), 333--334.]
\ansno 39. See {\sl Acta Arithmetica \bf 21} (1972), 45--50.
The best possible value of $c$ is unknown.
\ansno 40. If every one-digit change to a random table yields
a random table, all tables are random (or none are). If we don't
allow degrees of randomness, the answer must therefore be, ``Not
always.''
%folio 722 galley 10a (C) Addison-Wesley 1978 *
\ansbegin{3.6}
\mixans 1. {⊗RANDI⊗STJ⊗9F⊗Store exit location.\cr
⊗⊗STA⊗8F⊗Store value of $k$.\cr
⊗⊗LDA⊗XRAND⊗$\rA ← X$.\cr
⊗⊗MUL⊗7F⊗$\rAX ← aX$.\cr
⊗⊗INCX⊗1009⊗$\rX ← (aX + c)\mod m$.\cr
⊗⊗SLAX⊗5⊗$\rA ← (aX+c)\mod m$.\cr
⊗⊗STA⊗XRAND⊗Store $X$.\cr
⊗⊗MUL⊗8F⊗$\rA ← \lfloor kX/m\rfloor$.\cr
⊗⊗INCA⊗1⊗Add 1, so that $1 ≤ Y ≤k$.\cr
⊗9H⊗JMP⊗*⊗Return.\cr
⊗XRAND⊗CON⊗1⊗Value of $X$; $X↓0 = 1$.\cr
⊗8H⊗CON⊗0⊗Temp storage of $k$.\cr
⊗7H⊗CON⊗3141592621⊗The multiplier $a$.\quad\blackslug\cr}
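The subroutine can be sketched in Python; the word size $m = 10↑{10}$ (a five-byte decimal MIX word) is an assumption of this sketch, while $a = 3141592621$, $c = 1009$, and $X↓0 = 1$ are taken from the program.

```python
def make_randi(a=3141592621, c=1009, m=10**10, seed=1):
    """Python sketch of the MIX subroutine RANDI: X <- (aX + c) mod m,
    then Y = floor(kX/m) + 1, so that 1 <= Y <= k."""
    state = seed                              # X_0 = 1, as in XRAND
    def randi(k):
        nonlocal state
        state = (a * state + c) % m           # X <- (aX + c) mod m
        return k * state // m + 1             # Y = floor(kX/m) + 1
    return randi
```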
\ansno 2. Putting a random-number generator into a program
makes the results essentially unpredictable to the programmer.
If the behavior of the machine on each problem were known in
advance, few programs would ever be written. As Turing has said,
the actions of a computer quite often {\sl do} surprise its programmer,
especially when a program is being debugged.
So the world had better watch out.
%folio 723 galley 10b (C) Addison-Wesley 1978 *
\ansbegin{4.1}
\ansno 1. 1010, 1011, 1000, $\ldotss$,
11000, 11001, 11110.
\ansno 2. (a) $-110001$, $-11.001001001001 \ldotss
$, $11.0010010000111111 \ldotss\,$.
(b) $11010011$, $1101.001011001011 \ldotss$ , $111.011001000100000
\ldotss\,$.
\def\≡{\overline1}
(c) $\≡11\≡\≡$, $\≡0.0\≡\≡0110\≡\≡011 \ldotss
$, $10.011\≡111\≡ \ldotss\,$.
(d) $-9.4$, $- \ldotsm 7582417582413$, $\ldotss 562951413$.
\ansno 3. $(1010113.2)↓{2i}$.
\ansno 4. (a) Between rA and rX. (b) The remainder
in rX has radix point between bytes 3 and 4; the quotient in
rA has radix point one byte to the right of the least significant
portion of the register.
\ansno 5. It has been subtracted from $999 \ldotsm
9 = 10↑p - 1$, instead of from $1000 \ldotsm 0 = 10↑p$.
\ansno 6. (a,c) $2↑{p-1} - 1$, $-(2↑{p-1} - 1)$;\xskip (b) $2↑{p-1}
- 1$, $-2↑{p-1}$.
\ansno 7. A ten's complement representation for
a negative number $x$ can be obtained by considering $10↑n +
x$ (where $n$ is large enough for this to be positive) and extending
it on the left with infinitely many nines. The nines' complement
representation can be obtained in the usual manner. (These two
representations are equal for nonterminating decimals, otherwise
the nines' complement representation has the form $\ldotss (a)99999
\ldots$ while the ten's complement representation has the form
$\ldotss (a + 1)0000 \ldotss\,$.) The representations may be
considered sensible if we regard the value of the infinite sum
$N = 9 + 90 + 900 + 9000 + \cdots$ as $-1$, since $N
- 10N = 9$.
See also exercise 31, which considers $p$-adic
number systems. The latter agree with the $p$'s complement notations
considered here, for numbers whose radix-$p$ representation
is terminating, but there is no simple relation between the
field of $p$-adic numbers and the field of real numbers.
\ansno 8. $\sum ↓j a↓jb↑j = \sum ↓j (a↓{kj+k-1}b↑{k-1}
+ \cdots + a↓{kj})b↑{kj}$.
\ansno 9. {\tt A BAD AD0BE FACADE FADED}.\xskip [{\sl Note:}
Other possible ``number sentences'' would be {\tt D0} {\tt A} {\tt DEED} {\tt A}
{\tt DECADE}; {\tt A} {\tt CAD} {\tt FED} {\tt A} {\tt BABE} {\tt BEEF},
{\tt C0C0A}, {\tt C0FFEE}; {\tt B0B} {\tt FACED} {\tt A} {\tt DEAD} {\tt D0D0}.]
%folio 725 galley 1 (C) Addison-Wesley 1978 *
\ansno 10. $\dispstyle
\left[\vcenter{\halign{\rt{$#$}⊗\rt{$\,#$}⊗\rt{$\,#$}⊗\rt{$\,
#$}⊗\rt{$\,#$}⊗\rt{$\;#$}⊗\rt{$\;#\ldotsm$}\cr
\ldotsm,⊗a↓3,⊗a↓2,⊗a↓1,⊗a↓0;⊗a↓{-1},⊗a↓{-2},\cr
\ldotsm,⊗b↓3,⊗b↓2,⊗b↓1,⊗b↓0;⊗b↓{-1},⊗b↓{-2},\cr}}\right]=
\left[\vcenter{\halign{\rt{$#$}⊗\rt{$\,#$}⊗\rt{$\,#$}⊗\rt{$\,
#$}⊗\rt{$\,#$}⊗\rt{$\;#$}⊗\rt{$\;#\ldotsm$}\cr
\ldotsm,⊗A↓3,⊗A↓2,⊗A↓1,⊗A↓0;⊗A↓{-1},⊗A↓{-2},\cr
\ldotsm,⊗B↓3,⊗B↓2,⊗B↓1,⊗B↓0;⊗B↓{-1},⊗B↓{-2},\cr}}\right]$, if
$$A↓j=\left[\vcenter{\halign{\rt{$# $}⊗\rt{$\,#,\ldotss,$}⊗\rt{$\,#$}\cr
a↓{k↓{j+1}-1},⊗a↓{k↓{j+1}-2}⊗a↓{k↓j}\cr
⊗b↓{k↓{j+1}-2}⊗b↓{k↓j}\cr}}\right],\qquad B↓j=b↓{k↓{j+1}-1}\ldotsm b↓{k↓j},$$
where $\langle k↓n\rangle$ is any infinite sequence
of integers with $k↓{j+1} > k↓j$.
\ansno 11. (The following algorithm works both for addition
or subtraction, depending on whether the plus or minus sign
is chosen.)
Start by setting $k ← a↓{n+1} ← a↓{n+2} ← b↓{n+1}
← b↓{n+2} ← 0$; then for $m = 0$, 1, $\ldotss$, $n + 2$ do the following:
Set $c↓m ← a↓m \pm b↓m + k$; then if $c↓m ≥ 2$, set $k ← -1$
and $c↓m ← c↓m - 2$; otherwise if $c↓m < 0$, set $k ← 1$ and $c↓m ← c↓m
+ 2$; otherwise (i.e., if $0 ≤ c↓m ≤ 1$), set $k ← 0$.
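The carry rule can be exercised in Python; the conversion routines to and from negabinary digit strings are additions of this sketch, while the loop follows the algorithm of the answer.

```python
def to_negabinary(x):
    """Digits of the integer x in base -2, least significant first."""
    digits = []
    while x != 0:
        r = x & 1                             # remainder 0 or 1
        digits.append(r)
        x = -((x - r) >> 1)                   # x <- (x - r)/(-2)
    return digits or [0]

def negabinary_add(a_digits, b_digits, subtract=False):
    """The answer's algorithm: digitwise add (or subtract) with a carry
    k in {-1, 0, 1}; a digit c_m >= 2 is reduced by 2 with carry -1, and
    a digit c_m < 0 is raised by 2 with carry +1."""
    n = max(len(a_digits), len(b_digits))
    a = a_digits + [0] * (n + 3 - len(a_digits))
    b = b_digits + [0] * (n + 3 - len(b_digits))
    sign = -1 if subtract else 1
    c, k = [], 0
    for m in range(n + 3):                    # positions 0, 1, ..., n+2
        cm = a[m] + sign * b[m] + k
        if cm >= 2:
            k, cm = -1, cm - 2
        elif cm < 0:
            k, cm = 1, cm + 2
        else:
            k = 0
        c.append(cm)
    return c

def from_negabinary(digits):
    return sum(d * (-2) ** i for i, d in enumerate(digits))
```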
\ansno 12. (a) Subtract $\pm(\ldotsm a↓30a↓10)↓{-2}$ from $\pm
(\ldotsm a↓40a↓20a↓0)↓{-2}$
in the negabinary system. (See also exercise 7.1--18 for a trickier
solution.)\xskip (b) Subtract $(\ldotsm b↓30b↓10)↓2$
from $(\ldotsm b↓40b↓20b↓0)↓2$ in the binary system.
\ansno 13. $(1.909090\ldotsm)↓{-10} = (0.090909\ldotsm)↓{-10} = {1\over 11}$.
\ansno 14. \hjust to 329pt{$\def\\{\hskip 4.5pt}\hskip0pt plus 200pt
\eqalign{1\\1\\3\\2\\1⊗\qquad[5-4i]\cr
1\\1\\3\\2\\1⊗\qquad[5-4i]\cr
\noalign{\vskip -7.6pt}
\vjust{\hrule width 40.5pt}\cr
\noalign{\vskip-1.75pt}
1\\1\\3\\2\\1\cr
1\\1\\2\\0\\2\\\\\cr
1\\2\\1\\2\\3\\\\\\\\\cr
1\\1\\3\\2\\1\\\\\\\\\\\\\cr
1\\1\\3\\2\\1\\\\\\\\\\\\\\\\\cr
\noalign{\vskip 3pt\hrule width 76.5pt\vskip 3pt}
0\\1\\0\\3\\1\\1\\2\\0\\1⊗\qquad[9-40i]\cr}\hskip0pt plus 200pt$}
\ansno 15. $[-{10\over 11}, {1\over 11}]$, and the rectangle
shown in Fig.\ A--6.
\topinsert{\vskip 40mm
\moveright 190pt\hjust to 128pt{\caption Fig. A--6.
Fundamental region for quater-imaginary numbers.}}
\ansno 16. It is tempting to try to do this in a very simple
way, by using the rule $2 = (1100)↓{i-1}$ to take care of carries;
but that leads to a nonterminating method if, for example, we try to add 1
to $(11101)↓{i-1} = -1$.
The following solution does the job by providing
four related algorithms (namely for adding or subtracting 1
or $i$). If $α$ is a string of zeros and ones, let $α↑P$ be
a string of zeros and ones such that $(α↑P)↓{i-1} = (α)↓{i-1}
+ 1$; and let $α↑{-P}$, $α↑Q$, $α↑{-Q}$ be defined similarly, with
$-1$, $+i$, and $-i$ respectively in place of $+1$. Then
$$\eqalign{(α0)↑P⊗=α1;\cr(αx0)↑{-P}⊗=α↑{-Q}x1;\cr}\quad
\eqalign{(αx1)↑P⊗=α↑Qx0.\cr(α1)↑{-P}⊗=α0.\cr}\qquad\qquad
\eqalign{(α0)↑Q⊗=α↑P1;\cr(α0)↑{-Q}⊗=α↑Q1;\cr}\quad
\eqalign{(α1)↑Q⊗=α↑{-Q}0.\cr(α1)↑{-Q}⊗=α↑{-P}0.\cr}$$
Here $x$ stands for either 0 or 1, and the strings are
extended on the left with zeros if necessary. The processes
will clearly always terminate. Hence every number of the form
$a + bi$ with $a$ and $b$ integers is representable in the $i - 1$
system.
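The conclusion that every $a + bi$ is representable can be confirmed constructively; the division rule used here (digit $d = (a+b)\mod 2$, then divide $a + bi - d$ by $i - 1$) is a standard way to compute the digits, not the carrying method of the answer, so it is an assumption of this sketch.

```python
def to_base_i_minus_1(a, b):
    """Digits (0 or 1, least significant first) of the Gaussian integer
    a+bi in base i-1.  Since i == 1 (mod i-1), the digit is d = (a+b) mod 2;
    then (a+bi-d)/(i-1) = ((b-(a-d)) + (-(a-d)-b) i)/2, exactly."""
    digits = []
    while (a, b) != (0, 0):
        d = (a + b) & 1
        digits.append(d)
        a, b = (b - (a - d)) // 2, (-(a - d) - b) // 2
    return digits or [0]

def eval_base_i_minus_1(digits):
    """Value of a digit string in base i-1, least significant first."""
    base = complex(-1, 1)
    return sum(d * base ** k for k, d in enumerate(digits))
```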
\ansno 17. No (in spite of exercise 28); the number $-1$ cannot be so represented.
This can be proved by constructing a set $S$ as in Fig.\ 1. We do have
the representations $-i = (0.1111\ldotsm)↓{1+i}$, $i = (100.1111\ldotsm
)↓{1+i}$.
\lineskip 0pt
\ansno 18. Let $S↓0$ be the set of points $(a↓7a↓6a↓5a↓4a↓3a↓2a↓1a↓0)↓{i-1}$,
where each $a↓k$ is 0 or 1. (Thus, $S↓0$ is given by the 256 interior
dots shown in Fig.\ 1, if that picture is multiplied by 16.)
We first show that $S$ is closed: If $y↓1$, $y↓2$, $\ldots$ is an
infinite subset of $S$, we have $y↓n = \sum ↓{k≥1\lower 1pt\null} a↓{nk}16↑{-k}$,
where each $a↓{nk}$ is in $S↓0$. Construct a tree whose nodes
are $(a↓{n1}, \ldotss , a↓{nr})$, for $1 ≤ r ≤ n$, and let a
node of this tree be an ancestor of another node if it is an
initial subsequence of that node. By the infinity lemma this
tree has an infinite path $(a↓1, a↓2, a↓3, \ldotss)$, and it follows that $\sum
↓{k≥1} a↓k16↑{-k}$ is a limit point of $\{y↓1, y↓2, \ldotss\}$
in $S$.
\lineskip 1pt
By the answer to exercise 16, all numbers of the
form $(a + bi)/16↑k$ are representable, when $a$ and $b$ are
integers. Therefore if $x$ and $y$ are arbitrary reals and $k
≥ 1$, the number $z↓k = (\lfloor 16↑kx\rfloor + \lfloor 16↑ky\rfloor
i)/16↑k$ is in $S + m + ni$ for some integers $m$ and $n$. It
can be shown that $S + m + ni$ is bounded away from the origin
when $(m, n) ≠ (0, 0)$. Consequently if $|x|$ and $|y|$ are
fixed and $k$ is sufficiently large, we have $z↓k
\in S$, and $\lim↓{k→∞}z↓k = x + yi$ is in $S$.
\ansno 19. If $m > u$ or $m < l$, find $a \in D$ such that $m
≡ a \modulo b$; the desired representation will be a representation
of $m↑\prime = (m - a)/b$ followed by $a$. Note that $m > u$
implies $l < m↑\prime < m$; $m < l$ implies $m < m↑\prime < u$;
so the algorithm terminates.
[There are no solutions when $b = 2$. The representation
will be unique iff $0 \in D$; nonunique representation occurs
for example when $D = \{-3, -1, 7\}$, $b = 3$, since $(α)↓3 =
(\overline377\overline3α)↓3$. When $b ≥ 3$ it is not difficult
to show that there are exactly $2↑{b-3}$ solution sets $D$ in
which $|a| < b$ for all $a \in D$. Furthermore the set $D =
\{0$, 1, $2 - ε↓2b↑n$, $3 - ε↓3b↑n$, $\ldotss$, $b - 2 - ε↓{b-2}b↑n$, $b
- 1 - b↑n\}$ gives unique representations, for all $b ≥ 3$ and
$n ≥ 1$, when each $ε↓j$ is 0 or 1.]
\ansno 20. (a) $0.\overline1\overline1\overline1\,\ldots=\overline1.888\,
\ldots=\overline18.{111\atop777}\,\ldots=\overline18{1\atop7}.{222\atop666}\,
\ldots=\cdots=\overline18{123456\atop765432}.{777\atop111}\,\ldots$ has
nine representations.\xskip (b) A ``$D$-fraction'' $.a↓1a↓2\,\ldots$
always lies between $-1/9$ and $+71/9$. Suppose $x$ has ten or more
$D$-decimal representations. Then for sufficiently large $k$,
$10↑kx$ has ten representations that differ to the left of the
decimal point: $10↑kx = n↓1 + f↓1 = \cdots = n↓{10} + f↓{10}$
where each $f↓j$ is a $D$-fraction. By uniqueness of integer representations,
the $n↓j$ are distinct, say $n↓1 < \cdots < n↓{10}$, hence $n↓{10}
- n↓1 ≥ 9$; but this implies $f↓1 - f↓{10} ≥ 9 > 71/9 - (-1/9)$,
a contradiction.\xskip (c) Any number of the form $0.a↓1a↓2 \ldotsm
$, where each $a↓j$ is $-1$ or 8, equals $\overline1.a↓1↑\prime a↓2↑\prime
\,\ldots$ where $a↑\prime↓{j} = a↓j + 9$ (and it even has six
{\sl more} representations $\overline18.a↑{\prime\prime}↓1a↑{\prime\prime}↓2
\ldotsm$, etc.).
\ansno 21. We can convert to such a representation by using
a method like that suggested in the text for converting to balanced
ternary.
In contrast to the systems of exercise 20, zero
can be represented in infinitely many ways, all obtained from
${1\over 2} + \sum ↓{k≥1}(-4{1\over 2}) \cdot 10↑{-k}$ (or
from the negative of this representation) by multiplying it
by a power of ten. The representations of unity are $1{1\over
2} - {1\over 2}$, ${1\over 2} + {1\over 2}$, $5 - 3{1\over
2} - {1\over 2}$, $5 - 4{1\over 2} + {1\over 2}$, $50 - 45
- 3{1\over 2} - {1\over 2}$, $50 - 45 - 4{1\over 2} + {1\over
2}$, etc., where $\pm {1\over 2} = (\pm 4{1\over 2})(10↑{-1}
+ 10↑{-2} + \cdotss)$. [{\sl AMM \bf 57} (1950), 90--93.]
\ansno 22. Given some approximation $b↓n \ldotsm
b↓1b↓0$ with error $\sum ↓{0≤k≤n}b↓k10↑k
- x > 10↑{-t}$ for $t > 0$, we will show how to reduce the error
by approximately $10↑{-t}$. (The process can be started by finding
a suitable $\sum ↓{0≤k≤n} b↓k10↑k > x$; then a finite
number of reductions of this type will make the error less than
$ε$.) Simply choose $m > n$ so large that the decimal representation
of $-10↑mα$ has a one in position $10↑{-t}$ and no ones in positions
$10↑{-t+1}$, $10↑{-t+2}$, $\ldotss$, $10↑n$. Then $10↑mα + ($a suitable
sum of powers of 10 between $10↑m$ and $10↑n) + \sum ↓{0≤k≤n}
b↓k10↑k \approx \sum ↓{0≤k≤n}b↓k10↑k - 10↑{-t}$.
\ansno 23. The set $S = \leftset\sum ↓{k≥1} a↓kb↑{-k}\relv a↓k \in
D\rightset$ is closed as in exercise 18, hence measurable, and in fact
it has positive measure. Since $bS = \union↓{a\in D}(a + S)$, we
have $b\mu (S) = \mu (bS) ≤ \sum ↓{a\in D}\mu (a + S) = \sum
↓{a\in D}\mu (S) = b\mu (S)$, and we must have $\mu \biglp
(a + S) ∩ (a↑\prime + S)\bigrp = 0$ when $a ≠ a↑\prime \in D$.
Now $T$ has measure zero
since it is a union of countably many sets of the form $10↑k\biglp
n + ((a + S) ∩ (a↑\prime + S))\bigrp$, $a ≠ a↑\prime $, each
of measure zero.
[The set $T$ cannot be empty, since the real numbers
cannot be written as a countable union of disjoint, closed,
bounded sets; cf.\ {\sl AMM \bf 84} (1977), 827--828.
If $D$ has less than $b$ elements, the set of
numbers representable with radix $b$ and digits from $D$ has
measure zero. If $D$ has more than $b$ elements and represents all
reals, $T$ has infinite measure.]
\ansno 24. $\leftset 2a\cdot10↑k+a↑\prime\relv0≤a<5,0≤a↑\prime<2\rightset$
or $\leftset 5a↑\prime\cdot10↑k+a\relv0≤a<5,0≤a↑\prime<2\rightset$,
for $k ≥ 0$. [R. L. Graham and D. W. Matula have shown
that there are no more sets of integer digits with these properties.
And Andrew Odlyzko has shown that the restriction to integers
is superfluous, in the sense that if the smallest two elements
of $D$ are 0 and 1, all the digits must be integers.\xskip {\sl Proof:}
Let $S=\leftset\sum↓{k<0}a↓kb↑k\relv a↓k\in D\rightset$
and $X = \leftset(a↓n \ldotsm a↓0)↓b \relv a↓k \in D\rightset$; then $[0, ∞) =
\union↓{x\in X}(x + S)$, and $(x + S) ∩ (x↑\prime + S)$ has measure
zero for $x ≠ x↑\prime \in X$. We have $(0, 1) \subset S$, and
by induction on $m$ we will prove that $(m, m + 1) \subset x↓m
+ S$ for some $x↓m \in X$. Let $x↓m\in X$ be such that $(m, m
+ ε)∩ (x↓m + S)$ has positive measure for all $ε > 0$. Then $x↓m
≤ m$, and $x↓m$ must be an integer lest $x↓{\lfloor x↓m\rfloor}
+ S$ overlap $x↓m + S$ too much. If $x↓m > 0$, the fact that
$(m - x↓m, m - x↓m + 1) ∩ S$ has positive measure implies by
induction that this measure is 1, and $(m, m + 1) \subset x↓m
+ S$ since $S$ is closed. If $x↓m = 0$ and $(m, m + 1) \not
\subset S$, we must have $m < x↑\prime↓{m}
< m + 1$ for some $x↑\prime↓{m} \in X$, where $(m, x↑\prime↓{m})
\subset S$; but then $1 + S$ overlaps $x↑\prime↓{m} + S$.]
[If we drop the restriction $0 \in D$, there {\sl
are} many other cases, some of which are quite interesting,
especially the sets $\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$, $\{1,
2, 3, 4, 5, 51, 52, 53, 54, 55\}$, and $\{2, 3, 4, 5, 6, 52, 53,
54, 55, 56\}$. Alternatively if we allow negative digits we obtain
many other solutions by the method of exercise 19, plus further
sets like $\{-1$, 0, 1, 2, 3, 4, 5, 6, 7, $18\}$ which don't meet
the conditions of that exercise; it appears hopeless to find
a nice characterization of all solutions with negative digits.]
\ansno 25. A positive number whose base $b$ representation has
$m$ consecutive $(b - 1)$'s to the right of the decimal point
must have the form $c/b↑n + (b↑m - \theta )/b↑{n+m}$, where
$c$ and $n$ are nonnegative integers and $0 < \theta ≤1$. So if $u/v$
has this form, we find that $b↑{m+n}u = b↑mcv + b↑mv - \theta
v$. Therefore $\theta v$ is an integer which is a multiple of
$b↑m$. But $0 < \theta v ≤ v < b↑m$. $\biglp$There can be arbitrarily
long runs of other digits $aaaaa$, if $0 ≤ a < b - 1$,
for example in the representation of $a/(b - 1).\bigrp$
%folio 731 galley 2a (C) Addison-Wesley 1978 *
\ansno 26. The proof of ``sufficiency'' is a straightforward
generalization of the usual proof for base $b$, by successively
constructing the desired representation. The proof of ``necessity''
breaks into two parts: If $β↓{n+1}$ is greater than $\sum ↓{k≤n}
c↓kβ↓k$ for some $n$, then $β↓{n+1} - ε$ has no representation
for small $ε$. If $β↓{n+1} ≤ \sum ↓{k≤n} c↓kβ↓k$ for all
$n$, but equality does not always hold, we can show that there are
two representations for certain $x$. [See {\sl Transactions
of the Royal Society of Canada}, series III, {\bf 46} (1952),
45--55.]
\ansno 27. Proof by induction on $|n|$: If $n$ is even we must
take $e↓0 > 0$, and the result follows by induction, since
$n/2$ has a unique such representation. If $n$ is odd, we must
take $e↓0 = 0$, and the problem reduces to representing $-(n
- 1)/2$; if the latter quantity is either zero or one, there
is obviously only one way to proceed, otherwise it has a unique reversing
representation by induction.
[It follows that every positive integer has exactly {\sl two} such
representations with {\sl decreasing} exponents $e↓0>e↓1>\cdots>e↓t$:
one with $t$ even and the other with $t$ odd.]
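The inductive construction in this answer can be sketched as a short program (an illustrative sketch only; the function name `rep` is ours, not part of the exercise). It returns exponents $e↓0<e↓1<\cdots<e↓t$ with $n=2↑{e↓0}-2↑{e↓1}+2↑{e↓2}-\cdotss$:

```python
def rep(n):
    """Exponents e0 < e1 < ... with n = 2^e0 - 2^e1 + 2^e2 - ...,
    following the induction in the answer (empty list for n = 0)."""
    if n == 0:
        return []
    if n % 2 == 0:
        # n even: we must take e0 > 0, so shift the representation of n/2
        return [e + 1 for e in rep(n // 2)]
    # n odd: we must take e0 = 0, reducing to the representation of -(n-1)/2
    return [0] + [e + 1 for e in rep(-(n - 1) // 2)]

# e.g. rep(3) == [0, 1, 2], since 3 = 2^0 - 2^1 + 2^2
```

Each recursive call roughly halves $|n|$, which is why the process terminates.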
\ansno 28. A proof like that of exercise 27 may be given. Note
that $a + bi$ is a multiple of $1 + i$ by a complex integer
if and only if $a + b$ is even.
\ansno 29. It suffices to prove that any collection $\{T↓0, T↓1,
T↓2, \ldotss\}$ satisfying Property B may be obtained by collapsing
some collection $\{S↓0, S↓1, S↓2, \ldotss\}$, where $S↓0 = \{0, 1,
\ldotss , b - 1\}$ and all elements of $S↓1$, $S↓2$, $\ldots$ are
multiples of $b$.
To prove the latter statement, we may assume that
$1 \in T↓0$ and that there is a least element $b > 1$ such
that $b \notin T↓0$. We will prove, by induction on $n$,
that if $nb \not\in T↓0$, then $nb + 1$, $nb + 2$, $\ldotss$,
$nb + b - 1$ are not in any of the $T↓j$'s; but if $nb \in
T↓0$, then so are $nb + 1$, $\ldotss$, $nb + b - 1$. The result then
follows with $S↓1 = \leftset nb \relv nb \in T↓0\rightset$, $S↓2 = T↓1$, $S↓3 =
T↓2$, etc.
If $nb \notin T↓0$, then $nb = t↓0 + t↓1+
\cdotss$, where $t↓1$, $t↓2$, $\ldots$ are multiples of $b$; hence
$t↓0 < nb$ is a multiple of $b$. By induction, $(t↓0 + k) +
t↓1 + t↓2 + \cdots$ is the representation of $nb
+ k$, for $0 < k < b$; hence $nb + k \notin T↓j$ for any
$j$.
If $nb \in T↓0$ and $0 < k < b$, let the representation
of $nb + k$ be $t↓0 + t↓1 + \cdotss$. We cannot have $t↓j =
nb + k$ for $j ≥ 1$, lest $nb + b$ have two representations
$(b - k) + \cdots + (nb + k) + \cdots = (nb) + \cdots + b +
\cdotss$. By induction, $t↓0 \mod b = k$; and the representation
$nb = (t↓0 - k) + t↓1 + \cdots$ implies that $t↓0
= nb + k$.
[Reference: {\sl Nieuw Archief voor Wiskunde} (3)
{\bf 4} (1956), 15--17. A finite analog of this result was derived
by P. A. MacMahon, {\sl Combinatory Analysis \bf 1} (1915),
217--223.]
\ansno 30. (a) Let $A↓j$ be the set of numbers $n$ whose representation
does not involve $b↓j$; then by the uniqueness property, $n
\in A↓j$ iff $n + b↓j \notin A↓j$. Consequently $n \in
A↓j$ iff $n + 2b↓j \in A↓j$. It follows that, for $j ≠ k$, $n \in
A↓j ∩ A↓k$ iff $n + 2b↓jb↓k \in A↓j ∩ A↓k$. Let $m$ be the number
of integers $n \in A↓j ∩ A↓k$ such that $0 ≤ n < 2b↓jb↓k$. Then
this interval contains exactly $m$ integers that are in
$A↓j$ but not $A↓k$, exactly $m$ in $A↓k$ but not $A↓j$, and
exactly $m$ in neither $A↓j$ nor $A↓k$; hence $4m = 2b↓jb↓k$.
Therefore $b↓j$ and $b↓k$ cannot both be odd. But at least one
$b↓j$ is odd, of course, since odd numbers can be represented.
(b) According to (a) we can renumber the $b$'s so that $b↓0$ is odd and
$b↓1$, $b↓2$, $\ldots$ are even; then ${1\over2}b↓1$, ${1\over2}b↓2$,
$\ldots$ must also be a binary basis, and the process can be iterated.
(c) If it is a binary basis, we must have positive and negative $d↓k$'s for
arbitrarily large $k$, in order to represent $\pm2↑n$ when $n$ is large.
Conversely, the following algorithm may be used:
\algstep S1. [Initialize.] Set $k←0$.
\algstep S2. [Done?] If $n=0$, terminate.
\algstep S3. [Choose.] If $n$ is odd, include $2↑kd↓k$ in the representation,
and replace $n$ by $(n-d↓k)/2$. Otherwise set $n←n/2$.
\algstep S4. [Advance $k$.] Increase $k$ by 1 and return to S2.\quad\blackslug
\yskip At each step the choice is forced; furthermore
step S3 decreases $|n|$ unless $n=-d↓k$, hence the algorithm must terminate.
(d) Two iterations of steps S2--S4 in the preceding algorithm will transform
$4m→m$, $4m+1→m+5$, $4m+2→m+7$, $4m+3→m-1$. Arguing as in exercise 19, we
need only show that the algorithm terminates for $-2≤n≤8$; all other values of
$n$ are moved toward this interval. In this
range $3 → -1 → -2 → 6 → 8 → 2 → 7 → 0$ and $4 → 1 → 5 → 6$.
Thus $1 = 7 \cdot 2↑0 - 13 \cdot 2↑1 + 7 \cdot 2↑2 - 13 \cdot
2↑3 - 13 \cdot 2↑5 - 13 \cdot 2↑9 + 7 \cdot 2↑{10}$.
{\sl Note:} The choice $d↓0$, $d↓1$, $d↓2$, $\ldots
= 5$, $-3$, 3, 5, $-3$, 3, $\ldots$ also yields a binary basis. For
further details see {\sl Math.\ Comp.\ \bf 18} (1964), 537--546;
A. D. Sands, {\sl Acta Mathematica}, Acad.\ Sci.\ Hung., {\bf 8}
(1957), 65--86.
\ansno 31. (See also the related exercises 3.2.2--11, 4.3.2--13,
4.6.2--22.)
(a) By multiplying numerator and denominator by suitable powers
of 2, we may assume that $u = (\ldotsm u↓2u↓1u↓0)↓2$ and $v =
(\ldotsm v↓2v↓1v↓0)↓2$ are 2-adic integers,
where $v↓0 = 1$. The following computational
method now determines $w$, using the notation $u↑{(n)}$ to stand
for the integer $(u↓{n-1} \ldotsm u↓0)↓2 = u \mod 2↑n$ when
$n > 0$:
Let $w↓0 = u↓0$. For $n = 1$, 2, $\ldotss$, assume
that we have found an integer $w↑{(n)} = (w↓{n-1} \ldotsm w↓0)↓2$
such that $u↑{(n)} ≡ v↑{(n)}w↑{(n)} \modulo{2↑n}$.
Then we have $u↑{(n+1)} ≡ v↑{(n+1)}w↑{(n)}
\modulo {2↑n}$, hence $w↓n = 0$ or 1 according as
$(u↑{(n+1)} - v↑{(n+1)}w↑{(n)})\mod 2↑{n+1}$ is 0 or $2↑n$.
(b) Find the smallest integer $k$ such that $2↑k
≡ 1 \modulo {2n + 1}$. Then we have $1/(2n + 1) = m/(2↑k - 1)$ for
some integer $m$, $1 ≤ m < 2↑{k-1}$. Let $α$ be the $k$-bit binary
representation of $m$; then $(0.ααα\ldotsm)↓2$ times $2n + 1$
is $(0.111\ldotsm)↓2 = 1$ in the binary system, and $(\ldotsm ααα)↓2$
times $2n + 1$ is $(\ldotsm 111)↓2 = -1$ in the 2-adic system.
(c) If $u$ is rational, say $u = m/2↑en$ where
$n$ is odd and positive, the 2-adic representation of $u$ is
periodic, because the set of numbers with periodic expansions
includes $-1/n$ and is closed under the operations of negation,
division by 2, and addition. Conversely, if $u↓{N+λ}=u↓N$ for all
$N≥\mu$, the 2-adic number $(2↑λ-1)2↑{-\mu}u$ is an integer.
(d) The square of any number of the form $(\ldotsm
u↓2u↓11)↓2$ has the form $(\ldotsm 001)↓2$, hence the condition
is necessary. To show the sufficiency, we can use the following procedure
to compute $v = \sqrt{n}$ when $n \mod 8 = 1$:
\algstep H1. [Initialize.] Set $m ← (n - 1)/8$, $k ← 2$, $v↓0 ←
1$, $v↓1 ← 0$, $v ← 1$. (During this algorithm we will have $v=(v↓{k-1}\ldotsm
v↓1v↓0)↓2$ and $v↑2=n-2↑{k+1}m$.)
\algstep H2. [Transform.] If $m$ is even, set $v↓k ← 0$,
$m ← m/2$. Otherwise set $v↓k ← 1$, $m ← (m - v - 2↑{k-1})/2$, $v
← v + 2↑k$.
\algstep H3. [Advance $k$.] Increase $k$ by 1 and return to H2.\quad\blackslug
\ansno 32. A generalization appears
in {\sl Math.\ Comp.\ \bf 29} (1975), 84--86.
%folio 733 galley 2b (C) Addison-Wesley 1978 *
\ansbegin{4.2.1}
\def\\{\raise 3.1pt\hjust{$\scriptscriptstyle\!\!\not\,\,$}} % \\h will be h bar
\ansno 1. $N = (62, +.60\ 22\ 52\ 00)$;
$\\h = (37, +.10\ 54\ 50\ 00)$. Note that $10\\h$ would be $(38, +.01\
05\ 45\ 00)$.
\ansno 2. $b↑{E-q}(1 - b↑{-p})$, $b↑{-q-p}$;\xskip $b↑{E-q}(1 - b↑{-p})$,
$b↑{-q-1}$.
\ansno 3. When $e$ does not have its smallest value,
the most significant ``one'' bit (which appears in all such
normalized numbers) need not appear in the computer word.
\ansno 4. $(51, +.10209877)$; $(50, +.12346000)$; $(53, +.99999999)$.
The third answer would be $(54, +.10000000)$ if the first operand
had been $(45, -.50000000)$.
\ansno 5. If $x ~ y$ and $m$ is an integer then $mb + x ~ mb
+ y$. Furthermore $x~y$ implies $x/b~y/b$, by considering all
possible cases. Another crucial property is that $x$ and $y$ will
round to the same integer, whenever $x ~ y$.
Now if $b↑{-p-2}F↓v ≠ f↓v$ we must have $(b↑{p+2}f↓v)\mod
b ≠ 0$; hence the transformation leaves $f↓v$ unchanged unless
$e↓u - e↓v ≥ 2$. Since $u$ was normalized, it is nonzero and
$|f↓u + f↓v| > b↑{-1} - b↑{-2} ≥ b↑{-2}$: the leading nonzero
digit of $f↓u + f↓v$ must be at most two places to the right
of the radix point, and the rounding operation will convert
$b↑{p+j}(f↓u + f↓v)$ to an integer, where $j ≤ 1$. The proof
will be complete if we can show that $b↑{p+j+1}(f↓u + f↓v)
~ b↑{p+j+1}(f↓u + b↑{-p-2}F↓v)$. By the previous paragraph,
we have $b↑{p+2}(f↓u + f↓v) ~ b↑{p+2}f↓u + F↓v = b↑{p+2}(f↓u
+ b↑{-p-2}F↓v)$, which implies the desired result for all $j
≤ 1$.
Note that, when $b > 2$ is even, such an integer
$F↓v$ always exists; but when $b = 2$ we require $p + 3$ bits
(let $2F↓v$ be an integer). When $b$ is odd, an integer $F↓v$
always exists except in the case of division, when a remainder
of ${1\over 2}b$ is possible.
\ansno 6. (Consider the case $e↓u = e↓v$, $f↓u = -f↓v$ in Program
A.) Register A retains its previous sign, as in {\tt ADD}.
\ansno 7. Say that a number is normalized iff it is zero or
its fraction part satisfies ${1\over 6} < |f| < {1\over 2}$.
A $(p + 1)$-place accumulator suffices for addition and subtraction;
rounding (except during division) is equivalent to truncation.
A very pleasant system indeed! We might represent numbers with
excess-zero exponent, inserted between the first and subsequent
digits of the fraction, and complemented if the fraction is
negative, so that fixed-point order is preserved.
\ansno 8. (a) $(06, +.12345679) \oplus (06, -.12345678)$,\xskip $(01,
+.10345678) \oplus (00, -.94000000)$;\xskip\penalty1000 (b) $(99, +.87654321) \oplus
\hjust{itself}$,\xskip $(99, +.99999999) \oplus (91, +.50000000)$.
\ansno 9. $a = (-50, +.10000000)$, $b = (-41, +.20000000)$, $c =
a$, $d = (-41, +.80000000)$, $y = (11, +.10000000)$.
\ansno 10. $(50, +.99999000) \oplus (55, +.99999000)$.
%folio 735 galley 3a (C) Addison-Wesley 1978 *
\ansno 11. $(50, +.10000001) \otimes (50, +.99999990)$.
\ansno 12. If $0 < |f↓u| < |f↓v|$, then $|f↓u| ≤ |f↓v| - b↑{-p}$;
hence $1/b < |f↓u/f↓v| ≤ 1 - b↑{-p}/|f↓v| < 1 - b↑{-p}$. If
$0 < |f↓v| ≤ |f↓u|$, we have $1/b < |f↓u/f↓v|/b ≤ \biglp(1 - b↑{-p})/(1/b)\bigrp/b
= 1 - b↑{-p}$.
\ansno 13. See J. Michael Yohe, {\sl IEEE Transactions} {\bf C-22}
(1973), 577--586; cf.\ also exercise 4.2.2--24.
\mixans 14. {⊗FIX⊗STJ⊗9F⊗Float-to-fix subroutine:\cr
⊗⊗STA⊗TEMP\cr
⊗⊗LD1⊗TEMP(EXP)⊗$\rI1 ← e$.\cr
⊗⊗SLA⊗1⊗$\rA ← \pm\,f\,f\,f\,f\,0$.\cr
⊗⊗JAZ⊗9F⊗Is input zero?\cr
⊗⊗DEC1⊗1\cr
⊗⊗CMPA⊗=0=(1:1)⊗If leading byte is zero,\cr
⊗⊗JE⊗*-4⊗\quad shift left again.\cr
\\
⊗⊗ENN1⊗-Q-4,1\cr
⊗⊗J1N⊗FIXOVFLO⊗Is magnitude too large?\cr
\\
⊗⊗ENTX⊗0\cr
⊗⊗SRAX⊗0,1\cr
\\
⊗⊗CMPX⊗=1//2=\cr
⊗⊗JL⊗9F\cr
⊗⊗JG⊗*+2\cr
⊗⊗JAO⊗9F\cr
\\
⊗⊗STA⊗*+1(0:0)⊗Round, if necessary.\cr
⊗⊗INCA⊗1⊗Add $\pm 1$ (overflow is impossible).\cr
⊗9H⊗JMP⊗*⊗Exit from subroutine.\quad\blackslug\cr}
\mixans 15. {⊗FP⊗STJ⊗EXITF⊗Fractional part subroutine:\cr
⊗⊗JOV⊗OFLO⊗Ensure overflow is off.\cr
⊗⊗STA⊗TEMP⊗$\hjust{\tt TEMP} ← u$.\cr
⊗⊗ENTX⊗0\cr
⊗⊗SLA⊗1⊗$\rA ← f↓u$.\cr
\\
⊗⊗LD2⊗TEMP(EXP)⊗$\rI2 ← e↓u$.\cr
⊗⊗DEC2⊗Q\cr
⊗⊗J2NP⊗*+3\cr
⊗⊗SLA⊗0,2⊗Remove integer part of $u$.\cr
⊗⊗ENT2⊗0\cr
⊗⊗JANN⊗1F\cr
\\
⊗⊗ENN2⊗0,2⊗Fraction is negative: find\cr
⊗⊗SRAX⊗0,2⊗\quad its complement.\cr
⊗⊗ENT2⊗0\cr
⊗⊗JAZ⊗*+2\cr
⊗⊗INCA⊗1\cr
⊗⊗ADD⊗WM1⊗Add word size minus one.\cr
\\
⊗1H⊗INC2⊗Q⊗Prepare to normalize the answer.\cr
⊗⊗JMP⊗NORM⊗Normalize, round, and exit.\cr
⊗8H⊗EQU⊗1(1:1)\cr
⊗WM1⊗CON⊗8B-1,8B-1(1:4)⊗Word size minus one\quad\blackslug\cr}
\ansno 16. If $|c| ≥ |d|$, then set $r ← d \odiv c$,
$s ← c \oplus (r \otimes d)$; $x ← \biglp a \oplus (b \otimes r)\bigrp
\odiv s$, $y ← \biglp b \ominus (a \otimes r)\bigrp \odiv
s$. Otherwise set $r ← c \odiv d$, $s ← d \oplus (r \otimes
c)$; $x ← \biglp (a \otimes r) \oplus b\bigrp \odiv s$, $y ← \biglp
(b \otimes r) \ominus a\bigrp \odiv s$. Then $x + iy$ is the
desired approximation to $(a + bi)/(c + di)$. [{\sl CACM \bf 5} (1962),
435. Other algorithms for complex arithmetic and function evaluation
are given by P. Wynn, {\sl BIT \bf 2} (1962), 232--255; see
also Paul Friedland, {\sl CACM \bf 10} (1967), 665.]
\ansno 17. See Robert Morris, {\sl IEEE Transactions} {\bf C-20}
(1971), 1578--1579. Error analysis is more difficult with such
systems, so interval arithmetic is correspondingly more desirable.
\ansno 18. For positive numbers: shift fraction left until $f↓1
= 1$, then round, then if the fraction is zero (rounding overflow)
shift it right again. For negative numbers: shift fraction left
until $f↓1 = 0$, then round, then if the fraction is zero (rounding
underflow) shift it right again.
\ansno 19. $\biglp 43 + (1$ if $e↓v < e↓u) - (1$ if fraction overflow$)
- (10$ if result zero$) + (4$ if magnitude
is rounded up$) + (1$ if first rounding digit is $b/2) + (5$ if
rounding digits are $b/2\ 0 \ldotsm 0) + (7$ if rounding overflow$)
+ 7N + A(-1 + (11$ if $N > 0))\bigrp u$, where $N$ is the number
of left shifts during normalization, $A = 1$ if rX receives
nonzero digits (otherwise $A = 0$). The maximum time of $73u$
occurs for example when $$u = +50\ 01\ 00\ 00\ 00,\quad v = -46\ 49\ 99\
99\ 99,\quad b = 100.$$ [The average time, considering the
data in Section 4.2.4, will be about $45{1\over 2}u$.]